Dataset Match

We used the Treatment Outcomes Profile to join users of the C1 (n= 109,756; users=85,048) that were evaluated with the Treatment Outcome Profile (TOP) in the same date of Admission to entries in C1 datasets.


###casos duplicados en fecha de aplicación
#1     1 97779
#2     2  2483
#3     3   149
#4     4    21
#5     5     3
#no hay casos que tengan más de una fecha de admisión Y UNA BD MÁS RECIENTE
#CONS_TOP_df_dup_ENE_2020_prev9%>% dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ap_top)))%>% group_by(concat) %>% dplyr::mutate(n_hash_fech_ap=n(),n_dis_ano=n_distinct(ano_bd))%>% dplyr::filter(n_hash_fech_ap>1,n_dis_ano>1)

invisible(
CONS_TOP_df_dup_ENE_2020_prev9%>% dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ap_top)))%>% group_by(concat) %>% dplyr::mutate(n_hash_fech_ap=n(),n_dis_ano=n_distinct(ano_bd))%>% dplyr::filter(n_hash_fech_ap>1,n_dis_ano>1)
)

CONS_TOP_df_dup_ENE_2020_prev9_match<-
CONS_TOP_df_dup_ENE_2020_prev9%>% dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ap_top)))%>% group_by(concat) %>% dplyr::mutate(n_hash_fech_ap=n(),n_dis_ano=n_distinct(ano_bd))%>% dplyr::filter(n_hash_fech_ap==1)%>%
  dplyr::select(hash_key,concat,total_oh,dosis_oh,total_thc,dosis_thc,total_pbc,dosis_pbc,total_coc,dosis_coc,total_bzd,dosis_bzd,total_otra,dosis_otra,hurto,robo,venta_drogas,rina,total_vif,otro,total_transgresion,salud_psicologica,total_trabajo,total_educacion,salud_fisica,lugar_vivir,vivienda,calidad_vida,etapa_del_tratamiento)

CONS_C1_df_dup_JUL_2020_match_top<-
  CONS_C1_df_dup_JUL_2020%>%
  dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ing)))%>%
  dplyr::left_join(CONS_TOP_df_dup_ENE_2020_prev9_match,by="concat", suffix=c("","_TOP"))%>%
  dplyr::filter(!is.na(compromiso_biopsicosocial), etapa_del_tratamiento=="Inicio Tratamiento")%>%
  dplyr::select(-hash_key_TOP)%>%
  dplyr::filter_at(vars(total_oh,dosis_oh,total_thc,dosis_thc,total_pbc,dosis_pbc,total_coc,dosis_coc,total_bzd,dosis_bzd,total_otra,dosis_otra,hurto,robo,venta_drogas,rina,total_vif,otro,total_transgresion,salud_psicologica,total_trabajo,total_educacion,salud_fisica,lugar_vivir,vivienda,calidad_vida,etapa_del_tratamiento), all_vars(!is.na(.)))

    users_top_only_one_app_match<-CONS_TOP_df_dup_ENE_2020_prev9%>% dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ap_top)))%>% group_by(concat) %>% dplyr::mutate(n_hash_fech_ap=n(),n_dis_ano=n_distinct(ano_bd))%>% dplyr::filter(n_hash_fech_ap==1)%>% dplyr::ungroup()%>% dplyr::distinct(hash_key)%>% nrow()

#c("total_oh","dosis_oh","total_thc","dosis_thc","total_pbc","dosis_pbc","total_coc","dosis_coc","total_bzd","dosis_bzd","total_otra","dosis_otra","hurto","robo","venta_drogas","rina","total_vif","otro","total_transgresion","salud_psicologica","total_trabajo","total_educacion","salud_fisica","lugar_vivir","vivienda","calidad_vida")

# etapa_del_tratamiento      n      percent valid_percent
#                Egreso      4 3.470114e-05  0.0005719188
#    Inicio Tratamiento   6948 6.027587e-02  0.9934229339
#  Seguimiento 12 meses      0 0.000000e+00  0.0000000000
#  Seguimiento 15 meses      0 0.000000e+00  0.0000000000
#   Seguimiento 3 meses     34 2.949597e-04  0.0048613097
#   Seguimiento 6 meses      8 6.940227e-05  0.0011438376
#   Seguimiento 9 meses      0 0.000000e+00  0.0000000000
#                  <NA> 108276 9.393251e-01            NA


Of the total entries in TOP dataset (n= 109,756), we did not considered entries in TOP dataset had more than one application in the same date. We selected 6,375 cases that had no differences between the date of application and the date of admission to treatment. Additionally, these applications must had been administered at admission (excluding the 0,6% of applications that were administered at other stage of treatment). Finally, we discarded cases with null values in the variables of interest (see Figure 1).


Explore variables

We defined one outcome variable to indicate early drop-out (abandono_temp) that should vary depending on biopsychosocial compromise. We also collapsed variables related to transgression to norms (transg_norm) into a continuous variable, by counting actions committed in the last 4 weeks. Additionally, we added the variable salud_mean to get the mean of the scores in physical health, quality of life and psychological health. To capture the substance use behaviors, we defined the average pattern of drug use in alcohol, marijuana, cocaine paste base, cocaine sedatives and tranquilizers, and other substances, in terms of scores of quantity in the last four weeks (total_mean). Equally, we defined a score of the average dose of the mean reported by users on each substance. At last, we created the mean of scores of days of attendance to jobs or educational institutions in the last four weeks (trabajo_edu_mean). In case of total_vif_z, their distributions were very skewed, due to the fact that 79.8% had no scores, this is why we decided to recode it into absence of domestic violence (vif).


Values of doses over 50 were discarded and declared missing, due to the great amount of outliers.


Furthermore, we were interested in co-occurrent physical and psychiatric conditions among substance users. Consequently, we generated a variable called dg_cie_10, composed of users that at least had been diagnosed with a psychiatric condition or professionals had been studying a possible diagnosis. Also, we included dg_trs_fisico, which categorizes users without a diagnosed physical condition, and those that had one in study or had been diagnosed with a physical condition. Additionally, we generated the variable dg_cons_dep to identify drug dependence from hazardous consumption.


To capture profiles of consumption, we generated the following variables: oh identifies cases that used alcohol at least one day in the last four weeks. This same criteria was applied to marijuana (thc), cocaine paste base (pbc), cocaine (coc), benzodiazepines (bzd), and other drugs (otra). Users that reported at least two substances according to the aforementioned rule were categorized as polydrug users (polycons).


Lastly, the mean of the scores in many variables were dichotomized between users that had less than the 50% of users, compared to people that showed more scores than the 50% of the population. These were defined as Higher Frequency of Days using substances in the last four weeks (mayor_total), Higher Average Reported Daily Dose of Substances (mayor_dosis), Lower Health-Related Self-Reports (menor_salud), and Lower Days of Attendance to Work and Educational Centers in the last four weeks (menor_trabajo_edu). In a similar manner, transg_norm_bin identify users that reported at least one behavior that transgress the norm.


Table 1. Summary of Variables
Type Variable Missing Complete Rate Ordered No. Unique Top Counts Mean (logical) Counts (logical) Mean SD Min Perc. 25 Median Perc. 75 Max Histogram
factor compromiso_biopsicosocial 0 1.0000000 FALSE 3 2-M: 3380, 3-S: 1909, 1-L: 745 NA NA NA NA NA NA NA NA NA NA
factor cie_10 0 1.0000000 FALSE 4 Dia: 2374, Sin: 2357, En : 1185, Dia: 118 NA NA NA NA NA NA NA NA NA NA
factor dg_trs_fisico 0 1.0000000 FALSE 2 1: 4084, 0: 1950 NA NA NA NA NA NA NA NA NA NA
factor abandono_temp 0 1.0000000 FALSE 2 0: 4855, 1: 1179 NA NA NA NA NA NA NA NA NA NA
factor origen_ingreso_mod 0 1.0000000 FALSE 5 Con: 3238, Sec: 1502, Der: 527, Sec: 486 NA NA NA NA NA NA NA NA NA NA
factor motivodeegreso_mod_imp 0 1.0000000 FALSE 5 Aba: 2089, Alt: 1567, Aba: 1179, Der: 648 NA NA NA NA NA NA NA NA NA NA
factor dg_trs_cons_sus_or 0 1.0000000 FALSE 2 Dep: 4252, Con: 1782 NA NA NA NA NA NA NA NA NA NA
factor diagnostico_trs_fisico 0 1.0000000 FALSE 23 En : 3457, Sin: 1950, Otr: 159, Tra: 112 NA NA NA NA NA NA NA NA NA NA
factor otros_probl_at_sm_or 1,325 0.7804110 FALSE 12 Sin: 2434, Vio: 1515, Otr: 603, Abu: 93 NA NA NA NA NA NA NA NA NA NA
factor sus_principal_mod 0 1.0000000 FALSE 5 Alc: 2309, Pas: 2120, Coc: 1234, Mar: 296 NA NA NA NA NA NA NA NA NA NA
factor otras_sus1_mod 1,963 0.6746768 FALSE 5 Alc: 1657, Mar: 1208, Coc: 703, Pas: 334 NA NA NA NA NA NA NA NA NA NA
factor otras_sus2_mod 3,852 0.3616175 FALSE 5 Mar: 829, Alc: 638, Coc: 414, Pas: 156 NA NA NA NA NA NA NA NA NA NA
factor otras_sus3_mod 3,852 0.3616175 FALSE 5 Mar: 829, Alc: 638, Coc: 414, Pas: 156 NA NA NA NA NA NA NA NA NA NA
factor dg_trs_psiq_cie_10_or 2,357 0.6093802 FALSE 12 Tra: 1288, En : 1210, Tra: 532, Tra: 273 NA NA NA NA NA NA NA NA NA NA
factor x2_dg_trs_psiq_cie_10_or 5,797 0.0392774 FALSE 11 En : 86, Tra: 55, Tra: 39, Tra: 33 NA NA NA NA NA NA NA NA NA NA
factor x3_dg_trs_psiq_cie_10_or 6,007 0.0044746 FALSE 9 Tra: 9, En : 7, Tra: 5, Ret: 1 NA NA NA NA NA NA NA NA NA NA
factor x4_dg_trs_psiq_cie_10_or 6,032 0.0003315 FALSE 2 Tra: 1, Tra: 1, En : 0, Ret: 0 NA NA NA NA NA NA NA NA NA NA
factor x5_dg_trs_psiq_cie_10_or 6,034 0.0000000 FALSE 0 Tra: 0 NA NA NA NA NA NA NA NA NA NA
factor alta_adm 0 1.0000000 FALSE 2 0: 5483, 1: 551 NA NA NA NA NA NA NA NA NA NA
factor dg_cie_10_rec 0 1.0000000 FALSE 2 1: 3677, 0: 2357 NA NA NA NA NA NA NA NA NA NA
logical hurto 0 1.0000000 NA NA NA 0.1093802 FAL: 5374, TRU: 660 NA NA NA NA NA NA NA NA
logical robo 0 1.0000000 NA NA NA 0.0634736 FAL: 5651, TRU: 383 NA NA NA NA NA NA NA NA
logical venta_drogas 0 1.0000000 NA NA NA 0.0447464 FAL: 5764, TRU: 270 NA NA NA NA NA NA NA NA
logical rina 0 1.0000000 NA NA NA 0.1277759 FAL: 5263, TRU: 771 NA NA NA NA NA NA NA NA
logical otro 0 1.0000000 NA NA NA 0.0331455 FAL: 5834, TRU: 200 NA NA NA NA NA NA NA NA
logical lugar_vivir 0 1.0000000 NA NA NA 0.9191250 TRU: 5546, FAL: 488 NA NA NA NA NA NA NA NA
logical vivienda 0 1.0000000 NA NA NA 0.9212794 TRU: 5559, FAL: 475 NA NA NA NA NA NA NA NA
logical vif 0 1.0000000 NA NA NA 0.2053364 FAL: 4795, TRU: 1239 NA NA NA NA NA NA NA NA
logical na.rm 0 1.0000000 NA NA NA 1.0000000 TRU: 6034 NA NA NA NA NA NA NA NA
logical transg_norm_bin 0 1.0000000 NA NA NA 0.3619490 FAL: 3850, TRU: 2184 NA NA NA NA NA NA NA NA
logical menor_vivienda 0 1.0000000 NA NA NA 0.1005966 FAL: 5427, TRU: 607 NA NA NA NA NA NA NA NA
logical menor_salud 0 1.0000000 NA NA NA 0.4446470 FAL: 3351, TRU: 2683 NA NA NA NA NA NA NA NA
logical menor_trabajo_edu 0 1.0000000 NA NA NA 0.5024859 TRU: 3032, FAL: 3002 NA NA NA NA NA NA NA NA
logical mayor_total 0 1.0000000 NA NA NA 0.5122638 TRU: 3091, FAL: 2943 NA NA NA NA NA NA NA NA
logical mayor_dosis 0 1.0000000 NA NA NA 0.5046404 TRU: 3045, FAL: 2989 NA NA NA NA NA NA NA NA
logical polycons 0 1.0000000 NA NA NA 0.8967517 TRU: 5411, FAL: 623 NA NA NA NA NA NA NA NA
logical mayor_comp_biopsicsoc 0 1.0000000 NA NA NA 0.3163739 FAL: 4125, TRU: 1909 NA NA NA NA NA NA NA NA
logical menor_comp_biopsicsoc 0 1.0000000 NA NA NA 0.1234670 FAL: 5289, TRU: 745 NA NA NA NA NA NA NA NA
numeric total_oh 0 1.0000000 NA NA NA NA NA 7.639211e+00 8.787051e+00 0.0000000 0.0000000 4.000000e+00 1.200000e+01 2.800000e+01 <U+2587><U+2582><U+2582><U+2581><U+2582>
numeric dosis_oh 0 1.0000000 NA NA NA NA NA 8.529997e-01 2.115613e+00 0.0000000 0.0000000 0.000000e+00 1.000000e+00 3.900000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_thc 0 1.0000000 NA NA NA NA NA 4.773782e+00 9.057474e+00 0.0000000 0.0000000 0.000000e+00 4.000000e+00 2.800000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric dosis_thc 0 1.0000000 NA NA NA NA NA 8.529997e-01 2.115613e+00 0.0000000 0.0000000 0.000000e+00 1.000000e+00 3.900000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_pbc 0 1.0000000 NA NA NA NA NA 5.217766e+00 9.243989e+00 0.0000000 0.0000000 0.000000e+00 7.000000e+00 2.800000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric dosis_pbc 6 0.9990056 NA NA NA NA NA 1.197412e+00 3.065544e+00 0.0000000 0.0000000 0.000000e+00 1.000000e+00 4.000000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_coc 0 1.0000000 NA NA NA NA NA 2.750249e+00 6.207776e+00 0.0000000 0.0000000 0.000000e+00 2.000000e+00 2.800000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric dosis_coc 2 0.9996685 NA NA NA NA NA 2.780172e-01 1.804247e+00 0.0000000 0.0000000 0.000000e+00 0.000000e+00 4.000000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_bzd 0 1.0000000 NA NA NA NA NA 8.168711e-01 4.079368e+00 0.0000000 0.0000000 0.000000e+00 0.000000e+00 2.800000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric dosis_bzd 3 0.9995028 NA NA NA NA NA 4.209915e-01 2.703603e+00 0.0000000 0.0000000 0.000000e+00 0.000000e+00 4.000000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_otra 0 1.0000000 NA NA NA NA NA 9.479616e-01 4.865021e+00 0.0000000 0.0000000 0.000000e+00 0.000000e+00 2.800000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric dosis_otra 3 0.9995028 NA NA NA NA NA 4.209915e-01 2.703603e+00 0.0000000 0.0000000 0.000000e+00 0.000000e+00 4.000000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_vif 0 1.0000000 NA NA NA NA NA 1.370567e+00 4.525512e+00 0.0000000 0.0000000 0.000000e+00 0.000000e+00 2.800000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_transgresion 0 1.0000000 NA NA NA NA NA 3.785217e-01 7.973369e-01 0.0000000 0.0000000 0.000000e+00 0.000000e+00 5.000000e+00 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric salud_psicologica 0 1.0000000 NA NA NA NA NA 9.162081e+00 4.626610e+00 0.0000000 5.0000000 1.000000e+01 1.200000e+01 2.000000e+01 <U+2583><U+2586><U+2587><U+2583><U+2581>
numeric total_trabajo 0 1.0000000 NA NA NA NA NA 1.087720e+01 1.053146e+01 0.0000000 0.0000000 1.000000e+01 2.000000e+01 2.800000e+01 <U+2587><U+2581><U+2582><U+2583><U+2583>
numeric total_educacion 0 1.0000000 NA NA NA NA NA 3.215114e-01 2.302510e+00 0.0000000 0.0000000 0.000000e+00 0.000000e+00 2.800000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric salud_fisica 0 1.0000000 NA NA NA NA NA 1.109032e+01 4.969154e+00 0.0000000 8.0000000 1.000000e+01 1.500000e+01 2.000000e+01 <U+2582><U+2585><U+2587><U+2586><U+2583>
numeric calidad_vida 0 1.0000000 NA NA NA NA NA 1.035648e+01 5.104254e+00 0.0000000 7.0000000 1.000000e+01 1.500000e+01 2.000000e+01 <U+2583><U+2585><U+2587><U+2585><U+2583>
numeric sin_otros_prob_sm 0 1.0000000 NA NA NA NA NA 4.033808e-01 4.906166e-01 0.0000000 0.0000000 0.000000e+00 1.000000e+00 1.000000e+00 <U+2587><U+2581><U+2581><U+2581><U+2586>
numeric dg_cons_dep 0 1.0000000 NA NA NA NA NA 7.046735e-01 4.562272e-01 0.0000000 0.0000000 1.000000e+00 1.000000e+00 1.000000e+00 <U+2583><U+2581><U+2581><U+2581><U+2587>
numeric cnt_mod_cie_10_or 0 1.0000000 NA NA NA NA NA 1.044083e+00 2.289628e-01 1.0000000 1.0000000 1.000000e+00 1.000000e+00 4.000000e+00 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric row 0 1.0000000 NA NA NA NA NA 1.124796e+05 2.002838e+04 74,802.0000000 95,657.2500000 1.123425e+05 1.283335e+05 1.618310e+05 <U+2586><U+2587><U+2587><U+2586><U+2582>
numeric transg_norm 0 1.0000000 NA NA NA NA NA 5.838581e-01 9.605226e-01 0.0000000 0.0000000 0.000000e+00 1.000000e+00 6.000000e+00 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric salud_mean 0 1.0000000 NA NA NA NA NA 1.020296e+01 4.106428e+00 0.0000000 7.3333333 1.000000e+01 1.333333e+01 2.000000e+01 <U+2582><U+2585><U+2587><U+2586><U+2582>
numeric dosis_mean 11 0.9981770 NA NA NA NA NA 6.685207e-01 1.371863e+00 0.0000000 0.0000000 1.666667e-01 8.333333e-01 1.900000e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_mean 0 1.0000000 NA NA NA NA NA 3.690973e+00 3.363779e+00 0.0000000 1.0000000 3.000000e+00 5.333333e+00 2.333333e+01 <U+2587><U+2583><U+2581><U+2581><U+2581>
numeric vivienda_mean 0 1.0000000 NA NA NA NA NA 4.601011e+00 1.255400e+00 0.0000000 5.0000000 5.000000e+00 5.000000e+00 5.000000e+00 <U+2581><U+2581><U+2581><U+2581><U+2587>
numeric trabajo_edu_mean 0 1.0000000 NA NA NA NA NA 5.599354e+00 5.327197e+00 0.0000000 0.0000000 5.000000e+00 1.000000e+01 2.400000e+01 <U+2587><U+2582><U+2586><U+2581><U+2581>
numeric salud_mean_z 0 1.0000000 NA NA NA NA NA -1.158300e-02 9.970431e-01 -2.4888676 -0.7083302 -6.086200e-02 7.484732e-01 2.367144e+00 <U+2582><U+2585><U+2587><U+2586><U+2582>
numeric dosis_mean_z 11 0.9981770 NA NA NA NA NA -2.417700e-03 9.878153e-01 -0.4837886 -0.4837886 -3.637796e-01 1.162564e-01 1.319724e+01 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_mean_z 0 1.0000000 NA NA NA NA NA 1.238810e-02 1.002631e+00 -1.0877693 -0.7897023 -1.935682e-01 5.019215e-01 5.867128e+00 <U+2587><U+2582><U+2581><U+2581><U+2581>
numeric vivienda_mean_z 0 1.0000000 NA NA NA NA NA -8.369900e-03 1.014088e+00 -3.7249801 0.3139260 3.139260e-01 3.139260e-01 3.139260e-01 <U+2581><U+2581><U+2581><U+2581><U+2587>
numeric total_vif_z 0 1.0000000 NA NA NA NA NA 5.596800e-03 1.006895e+00 -0.2993448 -0.2993448 -2.993448e-01 -2.993448e-01 5.930462e+00 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric total_trabajo_edu_z 0 1.0000000 NA NA NA NA NA -6.085400e-03 1.000222e+00 -1.0574064 -1.0574064 -1.186186e-01 8.201692e-01 3.448775e+00 <U+2587><U+2582><U+2586><U+2581><U+2581>
numeric oh 0 1.0000000 NA NA NA NA NA 7.250580e-01 4.465220e-01 0.0000000 0.0000000 1.000000e+00 1.000000e+00 1.000000e+00 <U+2583><U+2581><U+2581><U+2581><U+2587>
numeric thc 0 1.0000000 NA NA NA NA NA 3.457077e-01 4.756379e-01 0.0000000 0.0000000 0.000000e+00 1.000000e+00 1.000000e+00 <U+2587><U+2581><U+2581><U+2581><U+2585>
numeric pbc 0 1.0000000 NA NA NA NA NA 3.543255e-01 4.783481e-01 0.0000000 0.0000000 0.000000e+00 1.000000e+00 1.000000e+00 <U+2587><U+2581><U+2581><U+2581><U+2585>
numeric coc 0 1.0000000 NA NA NA NA NA 3.099105e-01 4.624948e-01 0.0000000 0.0000000 0.000000e+00 1.000000e+00 1.000000e+00 <U+2587><U+2581><U+2581><U+2581><U+2583>
numeric bzd 0 1.0000000 NA NA NA NA NA 6.629100e-02 2.488107e-01 0.0000000 0.0000000 0.000000e+00 0.000000e+00 1.000000e+00 <U+2587><U+2581><U+2581><U+2581><U+2581>
numeric otra 0 1.0000000 NA NA NA NA NA 4.408350e-02 2.052977e-01 0.0000000 0.0000000 0.000000e+00 0.000000e+00 1.000000e+00 <U+2587><U+2581><U+2581><U+2581><U+2581>
#knitr::include_graphics("G:/Mi unidad/Alvacast/SISTRAT 2019 (github)/SUD_CL/Figures/Figure_Duplicates.svg")
#foreign::write.dta(CONS_C1_df_dup_JUL_2020_match_top_sel, "G:/Mi unidad/Alvacast/SISTRAT 2019 (github)/c1_top.dta")

#CONS_C1_df_dup_JUL_2020 CONS_C1_df_dup_JUL_2020_match_top CONS_TOP_df_dup_ENE_2020_prev9_match CONS_TOP_df_dup_ENE_2020_prev9
#sólo match
comb_datasets_a_n<-CONS_C1_df_dup_JUL_2020%>%
    dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ing)))%>%
    dplyr::left_join(CONS_TOP_df_dup_ENE_2020_prev9_match,by="concat", suffix=c("","_TOP"))%>%
    dplyr::filter(!is.na(hash_key_TOP))%>%
    dplyr::select(-hash_key_TOP)%>% 
    nrow()
comb_datasets_a_users<-CONS_C1_df_dup_JUL_2020%>%
    dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ing)))%>%
    dplyr::left_join(CONS_TOP_df_dup_ENE_2020_prev9_match,by="concat", suffix=c("","_TOP"))%>%
    dplyr::filter(!is.na(hash_key_TOP))%>%
    dplyr::select(-hash_key_TOP)%>% 
    dplyr::distinct(hash_key)%>% nrow()
#sólo los que tienen compromiso biopsicosocial
comb_datasets_b_n<-CONS_C1_df_dup_JUL_2020%>%
    dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ing)))%>%
    dplyr::left_join(CONS_TOP_df_dup_ENE_2020_prev9_match,by="concat", suffix=c("","_TOP"))%>%
    dplyr::filter(!is.na(compromiso_biopsicosocial), !is.na(hash_key_TOP))%>%
    dplyr::select(-hash_key_TOP)%>% 
    nrow()
comb_datasets_b_users<-CONS_C1_df_dup_JUL_2020%>%
    dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ing)))%>%
    dplyr::left_join(CONS_TOP_df_dup_ENE_2020_prev9_match,by="concat", suffix=c("","_TOP"))%>%
    dplyr::filter(!is.na(compromiso_biopsicosocial), !is.na(hash_key_TOP))%>%
    dplyr::select(-hash_key_TOP)%>% 
    dplyr::distinct(hash_key)%>% nrow()
#sólo los que tienen etapa de tratamiento de inicio de tratamiento
comb_datasets_c_n<-CONS_C1_df_dup_JUL_2020%>%
    dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ing)))%>%
    dplyr::left_join(CONS_TOP_df_dup_ENE_2020_prev9_match,by="concat", suffix=c("","_TOP"))%>%
    dplyr::filter(!is.na(compromiso_biopsicosocial), !is.na(hash_key_TOP), etapa_del_tratamiento=="Inicio Tratamiento")%>%
    dplyr::select(-hash_key_TOP)%>% 
    nrow()
comb_datasets_c_users<-CONS_C1_df_dup_JUL_2020%>%
    dplyr::mutate(concat=paste0(hash_key,"_",as.character(fech_ing)))%>%
    dplyr::left_join(CONS_TOP_df_dup_ENE_2020_prev9_match,by="concat", suffix=c("","_TOP"))%>%
    dplyr::filter(!is.na(compromiso_biopsicosocial), !is.na(hash_key_TOP), etapa_del_tratamiento=="Inicio Tratamiento")%>%
    dplyr::select(-hash_key_TOP)%>% 
    dplyr::distinct(hash_key)%>% nrow()
#sólo los que tienen tratamientos finalizados (no en curso)

#  dplyr::filter(motivodeegreso_mod_imp!="En curso")%>% #Sacar los tratamientos que estén en curso 

#https://stackoverflow.com/questions/46750364/diagrammer-and-graphviz
#https://mikeyharper.uk/flowcharts-in-r-using-diagrammer/
#http://blog.nguyenvq.com/blog/2012/05/29/better-decision-tree-graphics-for-rpart-via-party-and-partykit/
#http://blog.nguyenvq.com/blog/2014/01/17/skeleton-to-create-fast-automatic-tree-diagrams-using-r-and-graphviz/
#https://cran.r-project.org/web/packages/DiagrammeR/vignettes/graphviz-mermaid.html
#https://stackoverflow.com/questions/39133058/how-to-use-graphviz-graphs-in-diagrammer-for-r
#https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781789802566/1/ch01lvl1sec21/creating-diagrams-via-the-diagrammer-package
#https://justlegal.be/2019/05/using-flowcharts-to-display-legal-procedures/
#
#
library(DiagrammeR) #⋉
grViz("digraph flowchart {
      # node definitions with substituted label text
      node [fontname = Helvetica, shape = rectangle,fontsize = 10]        
      tab1 [label = '@@1']
      tab2 [label = '@@2']
      tab3 [label = '@@3']
      tab4 [label = '@@4']
      tab5 [label = '@@5']
      tab6 [label = '@@6']
      tab7 [label = '@@7']
      tab8 [label = '@@8']
      blank [label = '', width = 0.001, height = 0.001]

      # edge definitions with the node IDs
      tab1 -> tab2;
      blank -> tab3[label='',dir = none, shape=none, color = 'white',fontcolor = black, width=0, height=0];
      {rank=same; tab2 -> tab3 [label='Left join',fontsize = 11, dir=none]}; #⋉
      tab3 -> tab4 [label='  Result of the left join of the dataset',fontsize = 9];
      tab4 -> tab5 [label='  Only rows with available data on Biopsychosocial Compromise',fontsize = 9];
      tab5 -> tab6 [label='  Only strict matches with applications at the stage of admission',fontsize = 9];
      tab6 -> tab7 [label='  Only rows with available data on TOP scores and Diagnostic of CIE-10',fontsize = 9];
      tab7 -> tab8 [label='  Only rows with finished treatments',fontsize = 9];
      }
Only rows with available data on TOP scores

      [1]:  paste0('TOP Dataset \\n(n = ', formatC(nrow(CONS_TOP_df_dup_ENE_2020_prev9), format='f', big.mark=',', digits=0), ';\\n users: ',formatC(CONS_TOP_df_dup_ENE_2020_prev9%>% dplyr::distinct(hash_key)%>% nrow(), format='f', big.mark=',', digits=0),')')
      [2]:  paste0('Only applications w/ only one\\n application in the same date \\n(n = ', formatC(nrow(CONS_TOP_df_dup_ENE_2020_prev9_match), format='f', big.mark=',', digits=0), ';\\n users:',formatC(users_top_only_one_app_match, format='f', big.mark=',', digits=0),')')
      [3]:  paste0('Only applications w/ only one\\n application in the same date \\n(n = ', formatC(nrow(CONS_C1_df_dup_JUL_2020), format='f', big.mark=',', digits=0), ';\\n users:',formatC(CONS_C1_df_dup_JUL_2020%>% dplyr::distinct(hash_key)%>% nrow(), format='f', big.mark=',', digits=0),')')
      [4]:  paste0('Dataset \\n(n = ',formatC(comb_datasets_a_n,format='f', big.mark=',', digits=0),'\\nusers =',formatC(comb_datasets_a_users,format='f', big.mark=',', digits=0),')')
      [5]:  paste0('Dataset \\n(n = ',formatC(comb_datasets_b_n,format='f', big.mark=',', digits=0),'\\nusers =',formatC(comb_datasets_b_users,format='f', big.mark=',', digits=0),')')
      [6]:  paste0('Dataset \\n(n = ',formatC(comb_datasets_c_n,format='f', big.mark=',', digits=0),'\\nusers =',formatC(comb_datasets_c_users,format='f', big.mark=',', digits=0),')')
      [7]:  paste0('Dataset \\n(n = ', formatC(nrow(CONS_C1_df_dup_JUL_2020_match_top_sel_prev), format='f', big.mark=',', digits=0), ';\\n users: ',formatC(CONS_C1_df_dup_JUL_2020_match_top_sel_prev%>% dplyr::distinct(hash_key)%>% nrow(), format='f', big.mark=',', digits=0),')')
      [8]:  paste0('Final Sample \\n(n = ', formatC(nrow(CONS_C1_df_dup_JUL_2020_match_top_sel), format='f', big.mark=',', digits=0), ';\\n users: ',formatC(CONS_C1_df_dup_JUL_2020_match_top_sel%>% dplyr::distinct(hash_key)%>% nrow(), format='f', big.mark=',', digits=0),')')
      ")


sdasd


#Definir la base de datos
CONS_C1_df_dup_JUL_2020_explore<-
    CONS_C1_df_dup_JUL_2020%>%
      dplyr::mutate(treat= ifelse(row %in% unlist(CONS_C1_df_dup_JUL_2020_match_top_sel$row),1,0))%>%
  dplyr::mutate(abandono_temp=dplyr::case_when(motivodeegreso_mod_imp=="Abandono Temprano"~1,TRUE~0),
                alta_adm= dplyr::case_when(grepl("trativa",as.character(motivodeegreso_mod_imp))~1,TRUE~0))%>%
  dplyr::mutate(abandono_temp= as.factor(abandono_temp),alta_adm=as.factor(alta_adm),
                origen_ingreso_mod=as.factor(origen_ingreso_mod))%>%
    dplyr::mutate(dg_trs_fisico= dplyr::case_when(grepl("En estudio",as.character(diagnostico_trs_fisico))~1,
                                                  grepl("Sin trastorno",as.character(diagnostico_trs_fisico))~0,
                                                  is.na(as.character(diagnostico_trs_fisico))~0,
                                                  TRUE~1))%>%
  dplyr::mutate(dg_cons_dep= dplyr::case_when(grepl("Dependencia",as.character(dg_trs_cons_sus_or))~1,
                                              TRUE~0))%>%
  dplyr::mutate(sin_otros_prob_sm= dplyr::case_when(grepl("Sin otros problemas de salud mental",as.character(otros_probl_at_sm_or), ignore.case=T)~1,
                                                    TRUE~0))%>%
  dplyr::mutate(across(c(dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_cie_10_or,x4_dg_trs_psiq_cie_10_or,x5_dg_trs_psiq_cie_10_or),~dplyr::case_when(grepl("En estudio",as.character(.))~0,grepl("Sin trastorno",as.character(.))~0,is.na(.)~0,TRUE~1), .names="dg_{col}"))%>%
    dplyr::mutate(across(c(dg_trs_psiq_cie_10_or,x2_dg_trs_psiq_cie_10_or,x3_dg_trs_psiq_cie_10_or,x4_dg_trs_psiq_cie_10_or,x5_dg_trs_psiq_cie_10_or),~dplyr::case_when(grepl("En estudio",as.character(.))~1,grepl("Sin trastorno",as.character(.))~0,is.na(.)~0,TRUE~0), .names="instudy_{col}"))%>%
    dplyr::mutate(instudy_total_cie_10 = base::rowSums(dplyr::select(., instudy_dg_trs_psiq_cie_10_or, instudy_x2_dg_trs_psiq_cie_10_or, instudy_x3_dg_trs_psiq_cie_10_or, instudy_x4_dg_trs_psiq_cie_10_or, instudy_x5_dg_trs_psiq_cie_10_or)))%>%
      dplyr::mutate(dg_cie_10_rec=dplyr::case_when(grepl("Diagnosti",cie_10)~1,
                                             grepl("En estudio",cie_10)~1,
                                             TRUE~0))%>%
    dplyr::mutate(dg_cie_10_rec=as.factor(dg_cie_10_rec))%>%
  dplyr::mutate(dg_cie_10_rec=factor(dg_cie_10_rec,labels=c('No Disorder','In Study or Psychiatric Disorder')))%>%
  dplyr::mutate(compromiso_biopsicosocial=factor(compromiso_biopsicosocial,labels=c('1-Mild','2-Moderate','3-Severe')))%>%
  dplyr::mutate(compromiso_biopsicosocial=factor(origen_ingreso_mod,labels=c('Spontaneous','Assisted Referral','Other','Justice Sector','Health Sector')))%>%
  dplyr::mutate(dg_trs_fisico=factor(dg_trs_fisico,labels=c('No Disorder','In Study or Physical Disorder')))%>%
  dplyr::mutate(abandono_temp=factor(abandono_temp,labels=c('Other causes of discharge','Early Drop-out')))%>%
  dplyr::mutate(dg_cons_dep=factor(dg_cons_dep,labels=c('No Drug Dependence','Drug Dependence')))%>%
  dplyr::mutate(sin_otros_prob_sm=factor(sin_otros_prob_sm,labels=c('Other Mental Health Related Problems','No other mental health related problems')))%>%
  dplyr::filter(!grepl("En curso", as.character(motivodeegreso_mod_imp), ignore.case = T))

label(CONS_C1_df_dup_JUL_2020_explore$dg_cie_10_rec) <- "Psychiatric Condition (ICD-10)"
label(CONS_C1_df_dup_JUL_2020_explore$sin_otros_prob_sm) <- "Other health-related problems"
label(CONS_C1_df_dup_JUL_2020_explore$dg_trs_fisico) <- "Physical Condition"

label(CONS_C1_df_dup_JUL_2020_explore$dg_cons_dep) <- "Drug Dependence"
label(CONS_C1_df_dup_JUL_2020_explore$abandono_temp) <- "Early Drop-Out"
label(CONS_C1_df_dup_JUL_2020_explore$compromiso_biopsicosocial) <- "Compromiso Biopsicosocial(d)/Biopsychosocial Involvement(d)"
label(CONS_C1_df_dup_JUL_2020_explore$origen_ingreso_mod) <-"Origen de Ingreso (Primera Entrada)/Motive of Admission to Treatment (First Entry)"
CONS_C1_df_dup_JUL_2020_concat <- dplyr::mutate(CONS_C1_df_dup_JUL_2020,concat=paste0(hash_key,"_",as.character(fech_ing)))

#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_#_
library(compareGroups)
 
table <- compareGroups(treat ~ compromiso_biopsicosocial+ origen_ingreso_mod+ abandono_temp+ dg_trs_fisico+ dg_cons_dep+ sin_otros_prob_sm+ dg_cie_10_rec, data = CONS_C1_df_dup_JUL_2020_explore,include.miss = T) #abandono_temp+ dg_cons_dep+ sin_otros_prob_sm dg_trs_fisico dg_cie_10+ alta_adm+
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'compromiso_biopsicosocial' are removed since no observation in
that/those levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'origen_ingreso_mod' are removed since no observation in that/
those levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'abandono_temp' are removed since no observation in that/those
levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'dg_trs_fisico' are removed since no observation in that/those
levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'dg_cons_dep' are removed since no observation in that/those
levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'sin_otros_prob_sm' are removed since no observation in that/
those levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'dg_cie_10_rec' are removed since no observation in that/those
levels
pvals <- getResults(table)
#p.adjust(pvals, method = "BH")
restab <- createTable(table)
 export2md(restab, size=9,header.labels = c(p.overall = "p-value"), format="html",position="center",caption= "Table 2a. Summary descriptives table by groups, between selected rows (=1) and records that did not match (=0)")%>%
  kableExtra::add_footnote(c(paste0("Note. Variables related to C1 dataset. Records of treatments not finished were omitted (n=",CONS_C1_df_dup_JUL_2020%>% dplyr::filter(grepl("En curso", as.character(motivodeegreso_mod_imp), ignore.case = T))%>% nrow(),")")), notation = "none") %>%
  kableExtra::scroll_box(width = "100%", height = "375px")
Table 2a. Summary descriptives table by groups, between selected rows (=1) and records that did not match (=0)
0 1 p-value
N=95876 N=6034
Compromiso Biopsicosocial(d)/Biopsychosocial Involvement(d): <0.001
Spontaneous 43404 (45.3%) 3238 (53.7%)
Assisted Referral 10601 (11.1%) 527 (8.73%)
Other 4946 (5.16%) 281 (4.66%)
Justice Sector 8467 (8.83%) 486 (8.05%)
Health Sector 28458 (29.7%) 1502 (24.9%)
Origen de Ingreso (Primera Entrada)/Motive of Admission to Treatment (First Entry): <0.001
Consulta Espontánea 43404 (45.3%) 3238 (53.7%)
Derivación Asistida 10601 (11.1%) 527 (8.73%)
Otros 4946 (5.16%) 281 (4.66%)
Sector Justicia 8467 (8.83%) 486 (8.05%)
Sector Salud 28458 (29.7%) 1502 (24.9%)
Early Drop-Out: <0.001
Other causes of discharge 79105 (82.5%) 4855 (80.5%)
Early Drop-out 16771 (17.5%) 1179 (19.5%)
Physical Condition: <0.001
No Disorder 34879 (36.4%) 1950 (32.3%)
In Study or Physical Disorder 60997 (63.6%) 4084 (67.7%)
Drug Dependence: <0.001
No Drug Dependence 24025 (25.1%) 1782 (29.5%)
Drug Dependence 71851 (74.9%) 4252 (70.5%)
Other health-related problems: 0.020
Other Mental Health Related Problems 58648 (61.2%) 3600 (59.7%)
No other mental health related problems 37228 (38.8%) 2434 (40.3%)
Psychiatric Condition (ICD-10): <0.001
No Disorder 34632 (36.1%) 2357 (39.1%)
In Study or Psychiatric Disorder 61244 (63.9%) 3677 (60.9%)
Note. Variables related to C1 dataset. Records of treatments not finished were omitted (n=7846)


As seen in the table above, the treatments that did not matched with TOP datasets (users in treatment in where the TOP was not administered at the time of admission) may have slightly different characteristics than the sample with complete TOPs, excepting in the percentage of people that may had other health-related problems. This is relevant because this last variable is closely related to information assessed in the subsection of TOP named as transgression to norms.


vars_to_check <- c("hurto", "robo",  "venta_drogas", "rina",  "otro", "lugar_vivir", "vivienda", "salud_fisica", "calidad_vida", "salud_psicologica", "dosis_oh", "dosis_thc", "dosis_pbc", "dosis_coc", "dosis_bzd", "dosis_otra", "total_oh", "total_thc", "total_pbc", "total_coc", "total_bzd", "total_otra", "total_vif")

CONS_TOP_df_dup_ENE_2020_prev9_match_explore<-
CONS_TOP_df_dup_ENE_2020_prev9_match%>%
    dplyr::left_join(CONS_C1_df_dup_JUL_2020_concat,by="concat", suffix=c("","_c1"))%>%
  dplyr::ungroup()%>%
  dplyr::filter(across(one_of(vars_to_check),
                ~ !is.na(.x)))%>%
    dplyr::mutate(treat=ifelse(!is.na(hash_key_c1),1,0))%>%
  dplyr::mutate(dosis_oh=ifelse(dosis_oh>=50, NA,dosis_oh))%>%
  dplyr::mutate(dosis_oh=ifelse(dosis_thc>=50, NA,dosis_thc))%>%
  dplyr::mutate(dosis_pbc=ifelse(dosis_pbc>=50, NA,dosis_pbc))%>%
  dplyr::mutate(dosis_coc=ifelse(dosis_coc>=50, NA,dosis_coc))%>%
  dplyr::mutate(dosis_bzd=ifelse(dosis_bzd>=50, NA,dosis_bzd))%>%
  dplyr::mutate(dosis_otra=ifelse(dosis_otra>=50, NA,dosis_otra))%>%
  dplyr::mutate_at(vars(hurto, robo,  venta_drogas, rina,  otro, lugar_vivir, vivienda), ~as.numeric(.)-1)%>%
  dplyr::mutate(vif=ifelse(total_vif>0,1,0))%>%
  dplyr::mutate(transg_norm = base::rowSums(dplyr::select(., hurto,robo,venta_drogas,rina,otro,vif), na.rm=T))%>%
  dplyr::mutate(vif=as.logical(vif))%>%
  dplyr::mutate(salud_mean = base::rowSums(dplyr::select(., salud_fisica, calidad_vida, salud_psicologica), na.rm=T)/3)%>%
  dplyr::mutate(dosis_mean = base::rowSums(dplyr::select(., dosis_oh, dosis_thc, dosis_pbc, dosis_coc, dosis_bzd, dosis_otra), na.rm=T)/6)%>%
  dplyr::mutate(total_mean = base::rowSums(dplyr::select(., total_oh, total_thc, total_pbc, total_coc, total_bzd, total_otra), na.rm=T)/6)%>%
  dplyr::mutate(vivienda_mean = (base::rowSums(dplyr::select(.,lugar_vivir, vivienda), na.rm=T)/2)*5)%>%
  dplyr::mutate(trabajo_edu_mean = base::rowSums(dplyr::select(., total_trabajo, total_educacion), na.rm=T)/2)%>%
  dplyr::mutate_at(vars(hurto, robo,  venta_drogas, rina,  otro, lugar_vivir, vivienda), ~as.logical(.))%>%
  dplyr::mutate(salud_mean_z=scale2(salud_mean,na.rm=T))%>%
  dplyr::mutate(dosis_mean_z=scale2(dosis_mean,na.rm=T))%>%
  dplyr::mutate(total_mean_z=scale2(total_mean,na.rm=T))%>%
  dplyr::mutate(vivienda_mean_z=scale2(vivienda_mean,na.rm=T))%>%
  dplyr::mutate(total_vif_z=scale2(total_vif,na.rm=T))%>%
  dplyr::mutate(total_trabajo_edu_z=scale2(trabajo_edu_mean,na.rm=T))%>%
  #dicotomizar variables
  dplyr::mutate(oh=dplyr::case_when(total_oh>0~1,TRUE~0),
                thc=dplyr::case_when(total_thc>0~1,TRUE~0),
                pbc=dplyr::case_when(total_pbc>0~1,TRUE~0),
                coc=dplyr::case_when(total_coc>0~1,TRUE~0),
                bzd=dplyr::case_when(total_bzd>0~1,TRUE~0),
                otra=dplyr::case_when(total_otra>0~1,TRUE~0))%>%
  dplyr::mutate(transg_norm_bin=as.logical(dplyr::case_when(transg_norm>0~1,TRUE~0)))%>%
  dplyr::mutate(median_vivienda_mean=quantile(vivienda_mean,p=.5,na.rm=T))%>%
  dplyr::mutate(median_total_mean=quantile(total_mean,p=.5,na.rm=T))%>%
  dplyr::mutate(median_dosis_mean=quantile(dosis_mean,p=.5,na.rm=T))%>%
  dplyr::mutate(median_salud_mean=quantile(salud_mean,p=.5,na.rm=T))%>%
  dplyr::mutate(median_trabajo_edu_mean=quantile(trabajo_edu_mean,p=.5,na.rm=T))%>%
  dplyr::mutate(menor_vivienda=dplyr::case_when(vivienda_mean<median_vivienda_mean~1,TRUE~0))%>%
  dplyr::mutate(menor_salud=dplyr::case_when(salud_mean<median_salud_mean~1,TRUE~0))%>%
  dplyr::mutate(menor_trabajo_edu=dplyr::case_when(trabajo_edu_mean<median_trabajo_edu_mean~1,TRUE~0))%>%
  dplyr::mutate(mayor_total=dplyr::case_when(total_mean>=median_total_mean~1,TRUE~0))%>%
  dplyr::mutate(mayor_dosis=dplyr::case_when(median_dosis_mean>=dosis_mean~1,TRUE~0))%>%
  dplyr::mutate_at(vars(menor_salud, menor_trabajo_edu, mayor_total, mayor_dosis), ~as.logical(.))%>%
  dplyr::mutate(polycons = base::rowSums(dplyr::select(., oh, thc, pbc, coc, bzd, otra)),
                polycons =as.logical(ifelse(polycons>0,1,0)))%>%
  dplyr::mutate(mayor_dosis=dplyr::case_when(median_dosis_mean>=dosis_mean~1,TRUE~0))%>%
  dplyr::mutate(mayor_dosis=as.logical(mayor_dosis))%>%
  dplyr::mutate(menor_vivienda=as.logical(menor_vivienda))

label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$salud_mean_z) <- "Promedio ptjes. est. de estado de salud autorreportado/Average std. score of self-reported health"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$dosis_mean_z) <- "Promedio ptjes. est. de dosis media/Average std. score of average dose"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$total_mean_z) <- "Promedio ptjes. est. de patrones de uso de sus. principales/Average std. median score of patterns of drug use in main substances"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$total_trabajo_edu_z) <- "Promedio est. de días trabajados o asistidos a una institución educacional/Average std. days worked and attended an educational institution"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$total_vif_z) <- "Ptjes. est. del Total en Violencia Intrafamiliar/std. score in Domestic Violence"

label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$hurto) <- "Hurto/Theft (Transgression to Norms in past 4 weeks)"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$robo) <- "Robo/Robbery (Transgression to Norms in past 4 weeks)"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$venta_drogas) <- "Venta de Drogas/Drug-selling (Transgression to Norms in past 4 weeks)"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$rina) <- "Riña/Fights (Transgression to Norms in past 4 weeks)"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$otro) <- "Otras Transgresiones a la Norma/Other transgression to norms (Transgression to Norms in past 4 weeks)"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$lugar_vivir) <- "Lugar Estable para Vivir/Stable Place to Live in the past 4 weeks (Housing conditions in past 4 weeks)"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$vivienda) <- "Habita en una vivienda con las condiciones básicas/Lived in a Houshold with basic conditions in the past 4 weeks (Housing conditions in past 4 weeks)"
label(CONS_TOP_df_dup_ENE_2020_prev9_match_explore$vif) <- "Violencia Intrafamiliar Reportada en las últimas 4 semanas/Reported Domestic Violence Events in the past 4 weeks (Transgression to Norms)"

table <- compareGroups(treat ~salud_mean_z+ dosis_mean_z+ total_mean_z+ total_trabajo_edu_z+ hurto+ robo+ venta_drogas+ rina+ otro+ lugar_vivir+vif+ vivienda,
                       data = CONS_TOP_df_dup_ENE_2020_prev9_match_explore,include.miss = T,var.equal=T,method=c(2,2,2,2,3,3,3,3,3,3,3,3))
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'hurto' are removed since no observation in that/those levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'robo' are removed since no observation in that/those levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'venta_drogas' are removed since no observation in that/those
levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'rina' are removed since no observation in that/those levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'otro' are removed since no observation in that/those levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'lugar_vivir' are removed since no observation in that/those
levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'vif' are removed since no observation in that/those levels
Warning in compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i], :
Some levels of 'vivienda' are removed since no observation in that/those levels
pvals <- getResults(table, "p.overall")

#p.adjust(pvals, method = "BH")
restab <- createTable(table)
 export2md(restab, header.labels = c(p.overall = "p-value"), size=9,format="html",position="center",caption= "Table 2b. Summary descriptives table by groups, between selected rows (=1) and records that did not match (=0)")%>%
  kableExtra::add_footnote(c("Note. Variables related to TOP dataset.", paste0("We discarded cases with missing values in one of the variables (n=", CONS_TOP_df_dup_ENE_2020_prev9_match%>%dplyr::left_join(CONS_C1_df_dup_JUL_2020_concat,by="concat", suffix=c("","_c1"))%>%dplyr::ungroup()%>%dplyr::filter(across(one_of(vars_to_check),~ is.na(.x)))%>% nrow(),")")), notation = "none") %>%
  kableExtra::scroll_box(width = "100%", height = "375px")
Table 2b. Summary descriptives table by groups, between selected rows (=1) and records that did not match (=0)
0 1 p-value
N=86400 N=6507
Promedio ptjes. est. de estado de salud autorreportado/Average std. score of self-reported health 0.11 [-0.68;0.82] -0.68 [-1.31;0.11] 0.000
Promedio ptjes. est. de dosis media/Average std. score of average dose -0.36 [-0.36;-0.07] -0.21 [-0.36;0.22] <0.001
Promedio ptjes. est. de patrones de uso de sus. principales/Average std. median score of patterns of drug use in main substances -0.54 [-0.66;0.23] 0.35 [-0.36;1.24] 0.000
Promedio est. de días trabajados o asistidos a una institución educacional/Average std. days worked and attended an educational institution -0.10 [-1.02;0.82] -0.10 [-1.02;0.82] 0.103
Hurto/Theft (Transgression to Norms in past 4 weeks): <0.001
FALSE 82209 (95.1%) 5812 (89.3%)
TRUE 4191 (4.85%) 695 (10.7%)
Robo/Robbery (Transgression to Norms in past 4 weeks): <0.001
FALSE 83979 (97.2%) 6108 (93.9%)
TRUE 2421 (2.80%) 399 (6.13%)
Venta de Drogas/Drug-selling (Transgression to Norms in past 4 weeks): <0.001
FALSE 84599 (97.9%) 6217 (95.5%)
TRUE 1801 (2.08%) 290 (4.46%)
Riña/Fights (Transgression to Norms in past 4 weeks): <0.001
FALSE 81307 (94.1%) 5691 (87.5%)
TRUE 5093 (5.89%) 816 (12.5%)
Otras Transgresiones a la Norma/Other transgression to norms (Transgression to Norms in past 4 weeks): <0.001
FALSE 84154 (97.4%) 6284 (96.6%)
TRUE 2246 (2.60%) 223 (3.43%)
Lugar Estable para Vivir/Stable Place to Live in the past 4 weeks (Housing conditions in past 4 weeks): <0.001
FALSE 5627 (6.51%) 519 (7.98%)
TRUE 80773 (93.5%) 5988 (92.0%)
Violencia Intrafamiliar Reportada en las últimas 4 semanas/Reported Domestic Violence Events in the past 4 weeks (Transgression to Norms): <0.001
FALSE 77268 (89.4%) 5192 (79.8%)
TRUE 9132 (10.6%) 1315 (20.2%)
Habita en una vivienda con las condiciones básicas/Lived in a Houshold with basic conditions in the past 4 weeks (Housing conditions in past 4 weeks): 0.072
FALSE 6195 (7.17%) 506 (7.78%)
TRUE 80205 (92.8%) 6001 (92.2%)
Note. Variables related to TOP dataset.
We discarded cases with missing values in one of the variables (n=244)


In the Table above and despite we were focused in the treatments of the Agreement 1 dataset, we may see that the only variable that did not have greater differences between the matched records and those that did not is the basic housing conditions and the standardized average days worked or assisted to educational institutions. The rest may vary greatly.


To get an idea of the distribution of the biopsychosocial compromise among the selected sample, we generated a pie chart.


library(plotly)

CONS_C1_df_dup_JUL_2020_match_top_sel %>%
     dplyr::group_by(compromiso_biopsicosocial)%>%
      dplyr::mutate(compromiso_biopsicosocial=case_when(compromiso_biopsicosocial=="2-Moderado"~"Moderate",
                                                        compromiso_biopsicosocial=="1-Leve"~"Mild",
                                                        TRUE~"Severe"))%>%
       dplyr::summarise(n=n(),percent=round(n/sum(n),2))%>% 
     dplyr::ungroup()%>%
plot_ly( labels = ~compromiso_biopsicosocial, values = ~n, type = 'pie',hole = 0.6,textposition = 'outside',textinfo = 'label+percent', marker = list(colors = c("yellowgreen","orange","tomato"))) %>%
  layout(xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
         yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))

Figure 2. Biopsychosocial Compromise


We generated a correlation plot which wraps different type of variables (continuous, polytomous, dichotomous).1 We selected mostly the original variables from the instrument.


# It first identifies the types of variables, organizes them by type (continuous, polytomous, dichotomous), calls the appropriate correlation function, and then binds the resulting matrices together.D
mix_cor<- psych::mixedCor(data=CONS_C1_df_dup_JUL_2020_match_top_sel%>%dplyr::mutate(compromiso_biopsicosocial=as.numeric(compromiso_biopsicosocial)),c=2:20, #total_oh,dosis_oh total_thc dosis_thc total_pbc dosis_pbc total_coc dosis_coc total_bzd dosis_bzd  total_otra dosis_otra total_vif  total_transgresion salud_psicologica total_trabajo total_educacion salud_fisica calidad_vida 
         p=1,#compromiso_biopsicosocial,
         d=21:27,#, hurto robo  venta_drogas rina  otro lugar_vivir vivienda
         smooth=TRUE,
         correct=.5,
         global=TRUE,
         ncat=8,
         use="pairwise",
         method="kendall",
         weight=NULL)
# Getting the correlation matrix of the dataset.
comp_biopsicosoc_mat <- mix_cor$rho

# Explore the correlation structure of the dataset.
library(ggcorrplot)
ggcorrplot(comp_biopsicosoc_mat,
          ggtheme = ggplot2::theme_gray,
          colors = c("#E46726", "white", "#6D9EC1"),
           tl.cex=8)
Figure 3.Matrix Of Correlations Between Original Variables

Figure 3.Matrix Of Correlations Between Original Variables

#subscript out of bounds

#mixedCor(data=CONS_C1_df_dup_JUL_2020_match_top_sel,c=c("total_oh","dosis_oh", "total_thc", "dosis_thc", "total_pbc", "dosis_pbc", "total_coc", "dosis_coc", "total_bzd", "dosis_bzd", "total_otra", "dosis_otra", "total_vif",  "total_transgresion", "salud_psicologica", "total_trabajo", "total_educacion", "salud_fisica", "calidad_vida"), p="compromiso_biopsicosocial",d=c("hurto", "robo",  "venta_drogas", "rina",  "otro", "lugar_vivir", "vivienda"), smooth=TRUE,correct=.5,global=TRUE,ncat=8,use="pairwise",method="pearson",weight=NULL)

#subscript out of bounds
#set.seed(123)
#par(mar=c(1,1,1,1))
#comp_biopsicosoc_mat_ci<-mycor.ci(x=CONS_C1_df_dup_JUL_2020_match_top_sel%>%dplyr::mutate(compromiso_biopsicosocial=as.numeric(compromiso_biopsicoso#cial)), 
#                                  method="spearman", 
#                                  poly=TRUE, 
#                                  cvars=2:20, 
#                                  pvars=1,
#                                  dvars=21:27, 
#                                   n.iter=1000,
#                                  plot = F)
#
#cor_pmat(): para ver la sig estadística de las correlaciones


Also we included an heterogeneus matrix of correlations, consisting of polychoric, biserial and pearson correlations.2 In the following matrix, we selected the recoded variables useful to contrast our hypotheses.


require(polycor)
#Computes a heterogenous correlation matrix, consisting of Pearson product-moment correlations between numeric variables, polyserial correlations between numeric and ordinal variables, and polychoric correlations between ordinal variables.
hetcor_mat<-hetcor(CONS_C1_df_dup_JUL_2020_match_top_sel[c("compromiso_biopsicosocial",
                                                           "dosis_mean_z","total_mean_z","salud_mean_z","total_trabajo_edu_z","hurto","robo","venta_drogas","rina","otro","lugar_vivir","vivienda","vif",
                                                          "dg_cons_dep","sin_otros_prob_sm","abandono_temp","dg_cie_10_rec")], ML = T, std.err = T, use="complete.obs", bins=4, pd=TRUE)
#CONS_C1_df_dup_JUL_2020_match_top_sel%>% select(contains("dosis"))%>% hist()

ggcorrplot(hetcor_mat$correlations,
          ggtheme = ggplot2::theme_gray,
          colors = c("#E46726", "white", "#6D9EC1"), tl.cex=8)
Figure 4.Matrix Of Correlations Between Transformed Variables

Figure 4.Matrix Of Correlations Between Transformed Variables


Variables of TOP depending on Biopsychosocial Compromise


library(reshape2)  # for melt() function

Attaching package: 'reshape2'
The following objects are masked from 'package:data.table':

    dcast, melt
The following object is masked from 'package:tidyr':

    smiths
library(ggplot2)
# First we need to restructure the data into long format:
ocdMelt <- melt(CONS_C1_df_dup_JUL_2020_match_top_sel[c('compromiso_biopsicosocial','hurto',"robo","venta_drogas","rina","otro","lugar_vivir","vivienda","vif")], id=c('compromiso_biopsicosocial'))
names(ocdMelt) <- c('Group', 'Outcome_Measure', 'Frequency')
ocdMelt <-ocdMelt%>% dplyr::mutate(Frequency=factor(Frequency))%>% dplyr::group_by(Outcome_Measure,Group,Frequency)%>% dplyr::arrange(Outcome_Measure, Group,Frequency)%>% dplyr::add_tally()%>% distinct(.,.keep_all=T)%>% dplyr::ungroup()%>% dplyr::group_by(Group, Outcome_Measure)%>% dplyr::mutate(perc=n/sum(n))%>% dplyr::ungroup()

# plot
p5<-ggplot(ocdMelt, aes(Outcome_Measure,perc))+
  geom_bar(aes(fill= Frequency),stat="identity")+
  facet_grid(Group~.)+
  scale_y_continuous(labels = scales::percent_format(accuracy = 5L),limits=c(0,1))+
  scale_fill_manual(values=c("grey60","steelblue"),labels=c("Absence", "Presence")) +
  sjPlot::theme_sjplot2() + 
  ylab("Biopsychosocial Compromise")+
  theme(axis.text.x = element_text(size=8,vjust = .5,angle=25), axis.text.y = element_text(size=10),
        axis.title.y = element_text(size = 14)) +
  theme(
    panel.grid.minor = element_blank(), 
    panel.grid.major = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.background = element_blank(),
    line = element_blank())
p5
Figure 5. Bar Plot of Binary Answers By Biopsychosocial Compromise

Figure 5. Bar Plot of Binary Answers By Biopsychosocial Compromise


We generated z scores for those composite scores of multiple variables, in order to make them comparable. In the following figure we may observe the distribution of continuous variables among users with different levels of biopsychosocial compromise.


library(reshape2)  # for melt() function
library(ggplot2)
# First we need to restructure the data into long format:
ocdMelt <- melt(CONS_C1_df_dup_JUL_2020_match_top_sel[c('compromiso_biopsicosocial',"salud_mean_z","dosis_mean_z", "total_mean_z","total_trabajo_edu_z")], id=c('compromiso_biopsicosocial'))
names(ocdMelt) <- c('Group', 'Outcome_Measure', 'Frequency')
# plot
ocdBoxplot <- ggplot(ocdMelt, aes(Group, Frequency, color = Outcome_Measure)) + 
  geom_jitter(alpha=0.05) +
  geom_boxplot() + 
  labs(x='Biopsychosocial Compromise', y='Scores', color='Variable') + scale_y_continuous(breaks=seq(0,50, by=5))+  
  sjPlot::theme_sjplot2() + 
  scale_color_manual(values=c("darkslategray4","rosybrown","dimgray","coral3","grey60"),na.value="black") + #"Set2"
  theme(
    panel.grid.minor = element_blank(), 
    panel.grid.major = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.background = element_blank(),
    line = element_blank())
ocdBoxplot
Warning: Removed 11 rows containing non-finite values (stat_boxplot).
Warning: Removed 11 rows containing missing values (geom_point).
Figure 6. Box Plot of Continuous Answers By Biopsychosocial Compromise

Figure 6. Box Plot of Continuous Answers By Biopsychosocial Compromise


As seen above, users categorized with a severe Biopsychosocial Compromise showed slightly greater patterns of consumption and lower average health in comparison to users with a mild Biopsychosocial Compromise. However, and at least in an descriptive view, they seem to not differ much between the different levels.


We checked the multivariate normality of these four last variables, in order to choose the most adequate technique to perform the analyses. These analyses may let us infer whether that users in different levels of Biopsychosocial Compromise derive from the same population in terms of scores in the Treatment Outcomes Profile.


Table 3. Multivariate Normality Test
Test Statistic P Value Normality
Mardia Skewness 26,903.493 0.000 NO
Mardia Kurtosis 218.874 0.000 NO
MVN NA NA NO
Note. Using z-scores distributions of Average Health, Average days worked or in assistance to educational institutions, Average Frequency of Use in the last Month and Average Dose in the last Month


As seen above, the continuous variables did not follow a normal distribution.


Non-parametric Multivariate Test

Considering that our variables did not follow a normal distribution, and the test is composed by a mixture of different variable types, we performed a nonparametric test for multivariate samples. The variability of responses on each of the variables of top was compared between the groups using a non-parametric multivariate ANOVA (mnpv)3 with biopsychosocial compromise as between-subject factor (discarding inter-observer variability). The procedure generates a randomization test of 5,000 resamples to compute F-statistic (pseudo-F). Additionally, these models have the advantage that allow for differences in between-group variation, is not sensitive to multicollinearity, allows for the inclusion of more variables, and tolerate sparse data.


#mvnormtest::mshapiro.test()

dependent.vars <- cbind(CONS_C1_df_dup_JUL_2020_match_top_sel$salud_mean_z,CONS_C1_df_dup_JUL_2020_match_top_sel$dosis_mean_z, CONS_C1_df_dup_JUL_2020_match_top_sel$total_mean_z)
root.manova <- manova(dependent.vars ~ CONS_C1_df_dup_JUL_2020_match_top_sel$compromiso_biopsicosocial, intercept=F)
summary(root.manova,test="Pillai")
#test='Pillai'). The manova() function provides four multivariate tests by setting the test argument to either Pillai, #Wilks, Hotelling-Lawley, or Roy.

#La prueba de Wilk es la más comúnmente utilizada debido a que fue la primera prueba en ser derivada y a que posee una aproximación F bastante conocida.
#Lawley-Hotelling es también conocida como el estadístico T2 generalizado de Hotelling.
#La prueba de Pillai es la mejor prueba a utilizar en la mayoría de las situaciones. La prueba de Pillai proporciona resultados similares a los de las pruebas de Wilks y de Lawley-Hotelling.
#La prueba de la raíz más grande de Roy es la mejor cuando los vectores medios son colineales. La prueba de Roy no tiene una aproximación F satisfactoria.

summary.aov(root.manova)
CONS_C1_df_dup_JUL_2020_match_top_sel2<-
  CONS_C1_df_dup_JUL_2020_match_top_sel%>%
      dplyr::mutate_at(vars(hurto, robo,  venta_drogas, rina,  otro, lugar_vivir, vivienda,vif), ~as.numeric(as.factor(as.numeric(.)-1)))
#https://stackoverflow.com/questions/59166335/need-help-using-npmv-package-nonparametric-testing-in-r

invisible(
prueba_anormalidad1<-npmv::nonpartest(dosis_mean_z|total_mean_z|salud_mean_z|total_trabajo_edu_z|hurto|robo|venta_drogas|rina|otro|lugar_vivir|vivienda|vif~compromiso_biopsicosocial, CONS_C1_df_dup_JUL_2020_match_top_sel2,plots=F,permreps=5000,releffects=TRUE)
)

prueba_anormalidad_res1<-data.table::data.table(prueba_anormalidad1$results, keep.rownames=T)

#INOCRPORË LAS VAR DE CONTROL. SON TODAS DE TIPO FACTOR

sum_df_npmv<-data.frame(cbind(Model=paste0("npmv",c("Raw","Raw","Raw","Raw")),
  rbind(prueba_anormalidad_res1)))%>%
  dplyr::rename("Parameter"="rn")%>%
  dplyr::mutate(`P.value`=sprintf("%5.3f",P.value), `Permutation.Test.p.value`=sprintf("%5.3f",Permutation.Test.p.value))

sum_df_npmv%>% 
  dplyr::mutate(Test.Statistic=round(as.numeric(as.character(Test.Statistic)),3))%>% 
  knitr::kable(.,format = "html", format.args = list(decimal.mark = ".", big.mark = ","),
               caption="Table 4. Analysis of Variance",
                 align =rep('c', 101),
               col.names=c("Model", "Parameter", "Statistic", "DF1", "DF2","p-value", "Permutation Test p-value"))%>%
   kableExtra::kable_styling(bootstrap_options = c("striped", "hover"),font_size = 8) %>%
  kableExtra::add_footnote( c("Note. Model with average dose (z score), average frequency of use (z score), average reported health (z score), average days worked or attending an educational institution (z score), theft, burglary, drug selling, fights, other, place to live, housing condition and domestic violence."), notation = "none") %>%
  kableExtra::scroll_box(width = "100%", height = "375px")
Table 4. Analysis of Variance
Model Parameter Statistic DF1 DF2 p-value Permutation Test p-value
npmvRaw ANOVA type test p-value 176.234 12.152 25,236.39 0.000 0.000
npmvRaw McKeon approx. for the Lawley Hotelling Test 57.292 24.000 10,411.40 0.000 0.000
npmvRaw Muller approx. for the Bartlett-Nanda-Pillai Test 52.297 24.012 12,023.99 0.000 0.000
npmvRaw Wilks Lambda 54.793 24.000 12,018.00 0.000 0.000
Note. Model with average dose (z score), average frequency of use (z score), average reported health (z score), average days worked or attending an educational institution (z score), theft, burglary, drug selling, fights, other, place to live, housing condition and domestic violence.
##NO OCUPAR: MUY COMPLICADO
#The function ssnonpartest performs an all-subset algorithm to determine which variables cause significant effects, and between which factor levels. See the examples below for some basic uses and look in the help pages for each function for a much more detailed look.
#invisible(
#npmv::ssnonpartest(dosis_mean|total_mean|trabajo_edu_mean~compromiso_biopsicosocial,CONS_C1_df_dup_JUL_2020_match_top_sel,test=c(1,0,0,0),alpha=.05,factors.and.variables=TRUE)
#)


We also included the non-parametric relative treatment efects (RTE) for each variable in probabilities.


data.table(prueba_anormalidad1$releffects,keep.rownames = T)%>% 
  #mutate_at(vars(-rn),~scales::percent(.))%>%
  knitr::kable(.,format = "html", format.args = list(decimal.mark = ".", big.mark = ","),
               caption="Table 5. Non-parametric treaatment effects for each variable",
                 align =rep('c', 101),
               col.names=c("Levels of Biopsychosocial Compromise","Std.Avg.Dose", "Std.Avg.Drug.Cons.", "Std.Avg.Health", "Std.Avg.Work  & Education", "Robbery","Theft", "Drug-Selling","Fights", "Other Transgressions", "Stable Place to Live", "Basic Housing Cond.", "Dom. Violence"))%>%
   kableExtra::kable_styling(bootstrap_options = c("striped", "hover"),font_size = 8)%>%
    kableExtra::scroll_box(width = "100%", height = "225px")
Table 5. Non-parametric treaatment effects for each variable
Levels of Biopsychosocial Compromise Std.Avg.Dose Std.Avg.Drug.Cons. Std.Avg.Health Std.Avg.Work & Education Robbery Theft Drug-Selling Fights Other Transgressions Stable Place to Live Basic Housing Cond. Dom. Violence
1-Leve 0.42667 0.36441 0.62646 0.62207 0.45544 0.47374 0.48237 0.45428 0.49816 0.52768 0.52802 0.46333
2-Moderado 0.49507 0.47526 0.52688 0.53383 0.49082 0.49072 0.49321 0.49211 0.49790 0.51852 0.51753 0.50058
3-Severo 0.53752 0.59716 0.40261 0.39199 0.53380 0.52679 0.51898 0.53195 0.50445 0.45622 0.45785 0.51335


The can see that in average, there was a greater probability that the group of users with a severe biopsychosocial compromise shows a greater percentage of each component of Treatment Outcomes Profile than a randomly chosen group, excepting those components related to health & social functioning.


CONS_C1_df_dup_JUL_2020_match_top_sel2<-
  CONS_C1_df_dup_JUL_2020_match_top_sel%>%
      dplyr::mutate_at(vars(hurto, robo,  venta_drogas, rina,  otro, lugar_vivir, vivienda,vif), ~as.numeric(as.factor(as.numeric(.)-1)))
#https://www.rdocumentation.org/packages/vegan/versions/2.4-2/topics/adonis
#t looks like the jaccard distance is really only useful for binary data (presence/absence) while The bray-curtis matrix has been found to be robust for many abundance type data sets, especially those with many paired zeros like stomach content data can have.

# Test of significance of the canonical relationship:
# 6.3. Canonical correspondence analysis (CCA)
# Beware: there are cca functions in other R packages, in particular in ade4.
#http://jinliang.weebly.com/uploads/2/5/7/8/25781074/analysis_of_community_ecology_data_in_r.pdf
#
#Statistical testing of microbial beta diversity between treatments was conducted in R using the vegan R package ADONIS function with the default 999 permutations. The betadisper function from the vegan R package was used to analyze variance between groups. Differential abundance analysis of the OTUs in microcosms of different treatments was performed using the DESeq2 R package. The DESeq2 generated the log fold change values, normalized relative abundances (“basemean”), and p-values from the Wald test with local fitting.
#ADONIS results indicate the effect of each factor, substrate, concentration and timepoint, on
#microbial community dissimilarity
vegan_1<- 
  vegan::adonis(CONS_C1_df_dup_JUL_2020_match_top_sel2[,c("dosis_mean_z", "total_mean_z", "salud_mean_z","total_trabajo_edu_z", "hurto", "robo", "venta_drogas", "rina", "otro", "lugar_vivir", "vivienda", "vif"),drop = FALSE] ~ compromiso_biopsicosocial, data=CONS_C1_df_dup_JUL_2020_match_top_sel,permutations=1000, parallel = 4, na.rm=T)
vegan_2<- 
    vegan::adonis(CONS_C1_df_dup_JUL_2020_match_top_sel2[,c("dosis_mean_z", "total_mean_z", "salud_mean_z","total_trabajo_edu_z", "hurto", "robo", "venta_drogas", "rina", "otro", "lugar_vivir", "vivienda", "vif"),drop = FALSE] ~ compromiso_biopsicosocial, data=CONS_C1_df_dup_JUL_2020_match_top_sel,permutations=1000, parallel = 4, na.rm=T, method="gower")
#https://trace.tennessee.edu/cgi/viewcontent.cgi?article=5817&context=utk_graddiss

#v To test for differences in dispersion between groups (permutational multivariate analysis of dispersion [PERMDISP] was significant), we used the betadisper function in vegan


#(PERMANOVA pseudo-F = 3.95, p = .0001), and all pairwise comparisons were significant (p < .001, 999 permutations). The PERMDISP was significant (F = 11.47, p = .0003), indicating unequal variances among vegetation habitats.
# To test for differences in dispersion between groups (permutational multivariate analysis of dispersion [PERMDISP] was significant), we used the betadisper function in vegan
#to determine if there were significant differences in MDS clusters by sample origin, an analysis of variance was conducted using the function “adonis” in the R package “vegan”. 

sum_df_vegan<-data.frame(cbind(Model=paste0("vegan",c("Raw","Raw","Raw")),
          rbind(data.table::data.table(vegan_1$ aov.tab, keep.rownames=T))))%>%
            dplyr::rename("Parameter"="rn")%>%
  dplyr::mutate(`Pr..F.`=sprintf("%5.3f",`Pr..F.`))

sum_df_vegan%>% 
  dplyr::mutate(R2=round(as.numeric(as.character(R2)),3))%>% 
  knitr::kable(.,format = "html", format.args = list(decimal.mark = ".", big.mark = ","),
               caption="Table 6. Analysis of Variance using bray-curtis matrix",
                 align =rep('c', 101),
               col.names=c("Model","Parameter", "Df","SS", "Mean Squares","Pseudo F","R2", "p-value"))%>%
   kableExtra::kable_styling(bootstrap_options = c("striped", "hover"),font_size = 8) %>%
  kableExtra::add_footnote( c("Note. Model with average dose (z score), average frequency of use (z score), average reported health (z score), average days worked or attending an educational institution (z score), theft, burglary, drug selling, fights, other, place to live, housing condition and domestic violence."), notation = "none")%>%
  kableExtra::scroll_box(width = "100%", height = "375px")
Table 6. Analysis of Variance using bray-curtis matrix
Model Parameter Df SS Mean Squares Pseudo F R2 p-value
veganRaw compromiso_biopsicosocial 2 15.07365 7.5368228 207.1233 0.064 0.001
veganRaw Residuals 6,031 219.45665 0.0363881 NA 0.936 NA
veganRaw Total 6,033 234.53029 NA NA 1.000 NA
Note. Model with average dose (z score), average frequency of use (z score), average reported health (z score), average days worked or attending an educational institution (z score), theft, burglary, drug selling, fights, other, place to live, housing condition and domestic violence.
get_adjusted_r2 <- function(adonis_object) {
  n_observations <- ncol(adonis_object$coef.sites)
  d_freedom <- adonis_object$aov.tab$Df[1]
  r2 <- adonis_object$aov.tab$R2[1]
  adjusted_r2 <- vegan::RsquareAdj(r2, n_observations, d_freedom)
  adjusted_r2
}

#dg_cons_dep", "sin_otros_prob_sm","abandono_temp",,"dg_cie_10"
#https://stackoverflow.com/questions/16638297/vegdist-error-message-in-vegan-package


Despite both tests indicated significant differences between the levels of biopsychosocial compromise in scores of the TOP, the r-square let us interpret that these differences only explain the 6% of the variances of the scores, in such way that levels in the biopsychosocial compromise may not be capable of predicting a determin, affecting the generalizab


Relationship with outcomes

library(glmulti)
#sin_otros_prob_sm+ dg_cons_dep
res <- glmulti(abandono_temp ~ compromiso_biopsicosocial + origen_ingreso_mod + dg_cie_10_rec + dg_trs_fisico, #tal vez  no debería meter esta última variable.
               data=CONS_C1_df_dup_JUL_2020_match_top_sel2%>%dplyr::mutate(compromiso_biopsicosocial=case_when(compromiso_biopsicosocial=="2-Moderado"~"Moderate",compromiso_biopsicosocial=="1-Leve"~"Mild",TRUE~"Severe")),
               level=2, #The level of interaction between explanatory variables to be considered. Currently only 1 (only main effects) or 2 (main effects plus all pairwise interactions) are supported.
               fitfunction=glm, crit="aicc", 
               family=binomial,
               marginality=T,#Whether to use the general marginality rule or not. Default is FALSE. With TRUE, all predictors found in interaction terms are also included as main effects.
               confsetsize = 1000 #How many models should be returned in the confidence set of models?
               )
Initialization...
TASK: Exhaustive screening of candidate set.
Fitting...

After 50 models:
Best model: abandono_temp~1+origen_ingreso_mod+dg_cie_10_rec+dg_trs_fisico
Crit= 5556.33253408953
Mean crit= 5709.57787753471

After 100 models:
Best model: abandono_temp~1+origen_ingreso_mod+dg_cie_10_rec+dg_trs_fisico+compromiso_biopsicosocial+dg_cie_10_rec:compromiso_biopsicosocial
Crit= 5555.50391620977
Mean crit= 5636.49090121402
Completed.
#t is not iterative (all models are compared), and is fully automatic (no intervention of the user is required). It is primarily intended to be used for exploratory analyses with many candidate explanatory variables (say 10 to 20)
plot(res)
##########################################################################################
##################DEBIESE INCORPORAR EL TIPO DE PLAN!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
##########################################################################################

#http://www.metafor-project.org/doku.php/tips:model_selection_with_glmulti_and_mumin

#But let's take a look at the top 10 models:
top <- weightable(res) #Prepares a table with model formulas, IC values and IC relative supports
top <- top[top$aicc <= min(top$aicc) + 5,]

############################################################################################

###########################################################################################
plot.glmulti<-function(x, type="p", ...) 
{
  if ( type == "p") {
    plot(x@crits,xlab="Best models",
         ylab=paste("Support (",x@params$crit,")"),pch=19, main="IC profile", ...)
    abline(h=x@crits[1]+2,col="red")
        return(assign("ic_glmulti",data.table::data.table(data.frame(imp),keep.rownames = T),envir = .GlobalEnv))

  } else if (type == "w") {
    ww = exp(-(x@crits - x@crits[1])/2)
    ww=ww/sum(ww)
    plot(ww,xlab="Best models",
         ylab=paste("Evidence weight (",x@params$crit,")"),pch=19, main="Profile of model weights", ...)
    cucu=function(i) sum(ww[1:i])
    wwc=lapply(1:length(ww),cucu)
    abline(v=min(which(wwc>=0.95)),col="red")
  } else if (type=="r") {
    if (length(x@objects)) {
      # shows some diagnostics of the fit
      dev.new()
      par(mfrow=c(2,min(length(x@crits), 5)))
      for (k in 1:min(length(x@crits), 5)) 
        plot(x@objects[[k]],which=c(1), main=deparse(x@formulas[[k]]),...)
      for (k in 1:min(length(x@crits), 5)) 
        plot(x@objects[[k]],which=c(2),...)
    } else warning("Unavailable: use includeobjects=T when calling glmulti.")       
  } else if (type=="s") {
    # plots variable (i.e. terms) importances
    ww = exp(-(x@crits - x@crits[1])/2)
    ww=ww/sum(ww)
    # handle synonymies for interactions
    
    # this translates to unique notations (e.g. x:y and y:x are the same)
    clartou=function(x) {
      sort(strsplit(x, ":")[[1]])-> pieces
      if (length(pieces)>1) paste(pieces[1],":",pieces[2],sep="")
      else x
    }
    # list terms in models
    tet = lapply(x@formulas, function(x) sapply(attr(delete.response(terms(x)),"term.labels"), clartou))
    # all unique terms
    unique(unlist(tet))-> allt
    # importances
    sapply(allt, function(x) sum(ww[sapply(tet, function(t) x%in%t)]))-> imp
    # draw
    #par(oma=c(0,13,0,0))
    barplot(sort(imp),xlab="",xlim=c(0,1), ylab="",horiz=T,las=2, names.arg=allt[order(imp)],main="Model-averaged importance of terms", ...)
    abline(v=0.8, col="red")
    return(assign("var_imp_glmulti",data.table::data.table(data.frame(imp),keep.rownames = T),envir = .GlobalEnv))
  } else    warning("plot: Invalid type argument for plotting glmulti objects.")
}
plot(res, type="s")
############################################################################################
data.frame(res@crits)%>% dplyr::mutate(rn=row_number())%>%
    ggplot(aes(x=rn,y=res.crits))+
    geom_point(color="steelblue")+
   sjPlot::theme_sjplot2() + 
  ylab("Support (aicc)")+
  theme(axis.text.x = element_text(size=8,vjust = .5,angle=25), axis.text.y = element_text(size=10),
        axis.title.y = element_text(size = 14)) +
  theme(
    panel.grid.minor = element_blank(), 
    panel.grid.major = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.background = element_blank(),
    line = element_blank())+
  geom_hline(yintercept=res@crits[1]+2, color="red", linetype=2,size=1)+
  xlab("Best models")
Figure 7. Ordered Models and AICC

Figure 7. Ordered Models and AICC


We selected the model with lower AIC. The first model contained shown the more parsimonius indices (AICc= 5555.46, IC Rel. Support= 0.104).


#par(oma=c(0,0,0,24))
ggplot(var_imp_glmulti, aes(x=reorder(rn, -imp),y=imp))+
  geom_bar(stat="identity",alpha=.7, fill="steelblue")+
    sjPlot::theme_sjplot2() + 
  ylab("Model-averaged importance")+
  theme(axis.text.x = element_text(size=6.5,vjust = .5,angle=25), 
        axis.text.y = element_text(size=10),
        axis.title.y = element_text(size = 14)) +
  theme(
    panel.grid.minor = element_blank(), 
    panel.grid.major = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.background = element_blank(),
    line = element_blank())+
  geom_hline(yintercept=.5, color="red", linetype=2)+
  xlab("Terms")
Figure 8. Model-averaged variable importance of terms

Figure 8. Model-averaged variable importance of terms


The average importance of the term biopsychosocial compromise had the third major importance over the outcome variable (early drop-out) among the 1,000 models tested.


df_coef_glmulti<- data.frame()
for (i in 1:nrow(summary(res@objects[[1]])$coefficients)){
    df<-cbind(attr(summary(res@objects[[1]])$coefficients,"dimnames")[[1]][i],
            t(data.frame(exp(summary(res@objects[[1]])$coefficients[i,1] + 
          +     qnorm(c(0.025,.5,0.975)) * summary(res@objects[[1]])$coefficients[i,2]))))
    df_coef_glmulti<-rbind(df_coef_glmulti,df)
          }
df_coef_glmulti<-data.table::data.table(df_coef_glmulti)

data.table::data.table(summary(res@objects[[2]])$coefficients,keep.rownames = T)%>%
  dplyr::left_join(df_coef_glmulti,by=c("rn"="V1"))%>%
  dplyr::select(-V3)%>%
  dplyr::mutate(V2=round(as.numeric(as.character(V2)),2))%>%
  dplyr::mutate(V4=round(as.numeric(as.character(V4)),2))%>%
  dplyr::mutate(Estimate=round(as.numeric(as.character(Estimate)),2))%>%
  dplyr::mutate(OR=round(exp(as.numeric(as.character(Estimate))),2))%>%
  #exp(summary(m)$coefficients["DSH",1] + 
#+     qnorm(c(0.025,0.5,0.975)) * summary(m)$coefficients["DSH",2])
#[1]  1.472098  4.197368 11.967884
  dplyr::mutate(`Std. Error`=round(as.numeric(as.character(`Std. Error`)),2))%>% 
  dplyr::mutate(`z value`=round(as.numeric(as.character(`z value`)),2))%>% 
  dplyr::mutate(`Pr(>|z|)`=round(as.numeric(as.character(`Pr(>|z|)`)),4))%>% 
  dplyr::rename("Terms"="rn", "OR_lo"="V2","OR_up"="V4")%>%
  dplyr::select(Terms,Estimate,OR,OR_lo,OR_up, everything())%>%
  knitr::kable(.,format = "html", format.args = list(decimal.mark = ".", big.mark = ","),
               caption="Table 7. Logistic Regression Coefficients",
                 align =rep('c', 101))%>%
   kableExtra::kable_styling(bootstrap_options = c("striped", "hover"),font_size = 8)%>%
  kableExtra::add_footnote( c(paste("Note. AICc=",round(res@objects[[1]]$aic,3)),
                              paste("Defined model=",top$model[1])), notation = "none")%>%
  kableExtra::scroll_box(width = "100%", height = "375px")
Table 7. Logistic Regression Coefficients
Terms Estimate OR OR_lo OR_up Std. Error z value Pr(>|z|)
(Intercept) -2.08 0.12 0.07 0.16 0.21 -9.99 0.0000
origen_ingreso_modDerivación Asistida -0.22 0.80 0.63 1.02 0.12 -1.82 0.0690
origen_ingreso_modOtros -0.62 0.54 0.37 0.77 0.19 -3.32 0.0009
origen_ingreso_modSector Justicia -0.42 0.66 0.51 0.87 0.14 -3.06 0.0022
origen_ingreso_modSector Salud -0.39 0.68 0.57 0.79 0.08 -4.74 0.0000
dg_cie_10_rec1 1.59 4.90 3.00 7.33 0.23 6.86 0.0000
dg_trs_fisico1 -0.72 0.49 0.58 0.77 0.22 -3.26 0.0011
compromiso_biopsicosocialModerate -0.04 0.96 0.72 1.64 0.24 -0.18 0.8582
compromiso_biopsicosocialSevere 0.30 1.35 1.09 2.75 0.27 1.14 0.2527
dg_cie_10_rec1:compromiso_biopsicosocialModerate 0.02 1.02 0.65 1.76 0.26 0.09 0.9306
dg_cie_10_rec1:compromiso_biopsicosocialSevere -0.45 0.64 0.40 1.18 0.28 -1.61 0.1076
dg_trs_fisico1:compromiso_biopsicosocialModerate 0.28 1.32 NA NA 0.24 1.17 0.2436
dg_trs_fisico1:compromiso_biopsicosocialSevere 0.48 1.62 NA NA 0.26 1.88 0.0596
Note. AICc= 5555.46
Defined model= abandono_temp ~ 1 + origen_ingreso_mod + dg_cie_10_rec + dg_trs_fisico + compromiso_biopsicosocial + dg_cie_10_rec:compromiso_biopsicosocial
#abandono_temp ~ 1 + compromiso_biopsicosocial + origen_ingreso_mod + dg_cie_10 + dg_trs_fisico + dg_trs_fisico:compromiso_biopsicosocial
#print(res@formulas[[2]])


The next figures show the marginal effects of the interaction terms among the variables.


library(lsmeans)
refR_1 <- lsmeans(res@objects[[1]], specs = c("compromiso_biopsicosocial","dg_trs_fisico"))
ref_dfR_1 <- as.data.frame(cbind(summary(refR_1)[1:2],exp(summary(refR_1)[,c(3,6,7)])))

g4R_1 <- ggplot(ref_dfR_1, aes(x=compromiso_biopsicosocial, y=lsmean,group=dg_trs_fisico, colour=dg_trs_fisico))+
geom_errorbar(aes(ymin=asymp.LCL, ymax=asymp.UCL), width=.1,position=position_dodge(0.1), size=1) +
geom_line(position=position_dodge(0.1), size=.5)+
geom_point(position=position_dodge(0.1), size=2)+
  xlab("Biopsychosocial Compromise")+
  ylab("Odds Ratio (OR) of Early Drop-out")+
  sjPlot::theme_sjplot2() +
  geom_rect_interactive(alpha = 0.1, xmin=.1, xmax=.1, ymin=.1,ymax=.1) +
    # Remove plot elements added by geom_rect_interactive
  theme(legend.position="bottom")+
  guides(color=guide_legend(ncol=4))+
  labs(color="Cause of Discharge")+
  scale_colour_brewer(name="Cumulative Percentage", palette = "Set1",labels=c("No Dg.", "In study or Dg."))+
  theme(legend.title = element_blank())
  #theme(axis.text = element_blank(), axis.ticks = element_blank())

ggiraph(code = print(g4R_1))

Figure 9. Interactive Effects in the Prob. of Early Drop-out with Diagnose of Physical Diagnosis

library(lsmeans)
refR_2 <- lsmeans(res@objects[[1]], specs = c("compromiso_biopsicosocial","dg_cie_10_rec"))
ref_dfR_2 <- as.data.frame(cbind(summary(refR_2)[1:2],exp(summary(refR_2)[,c(3,6,7)])))

g4R_2 <- ggplot(ref_dfR_2, aes(x=compromiso_biopsicosocial, y=lsmean,group=dg_cie_10_rec, colour=dg_cie_10_rec))+
geom_errorbar(aes(ymin=asymp.LCL, ymax=asymp.UCL), width=.1,position=position_dodge(0.1), size=1) +
geom_line(position=position_dodge(0.1), size=.5)+
geom_point(position=position_dodge(0.1), size=2)+
  xlab("Psychiatric Condition")+
  ylab("Odds Ratio (OR) of Early Drop-out")+
  sjPlot::theme_sjplot2() +
  geom_rect_interactive(alpha = 0.1, xmin=.1, xmax=.1, ymin=.1,ymax=.1) +
    # Remove plot elements added by geom_rect_interactive
  theme(legend.position="bottom")+
  guides(color=guide_legend(ncol=4,name = "Cause of Discharge"))+
  labs(color="Motive of Admission to Treatment")+
  scale_colour_brewer(palette = "Set1",labels=c("No Dg.", "In study or Dg."))+#,labels=c("No Dg.", "In study", "Dg."))+
  theme(legend.title = element_blank())
  #theme(axis.text = element_blank(), axis.ticks = element_blank())

ggiraph(code = print(g4R_2))

Figure 10. Interactive Effects in the Prob. of Early Drop-out with Diagnose of Psychiatric Diagnosis

save.image("G:/Mi unidad/Alvacast/SISTRAT 2019 (github)/biopsychosoc_comp.RData")
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Chile.1252  LC_CTYPE=Spanish_Chile.1252   
[3] LC_MONETARY=Spanish_Chile.1252 LC_NUMERIC=C                  
[5] LC_TIME=Spanish_Chile.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] gdtools_0.2.2           glmulti_1.0.8           leaps_3.1              
 [4] rJava_0.9-12            reshape2_1.4.4          polycor_0.7-10         
 [7] DiagrammeR_1.0.6.1.9000 compareGroups_4.4.5     MVN_5.8                
[10] citr_0.3.2              ggcorrplot_0.1.3        psych_2.0.7            
[13] ggiraph_0.7.0           finalfit_1.0.1          lsmeans_2.30-0         
[16] emmeans_1.4.7           choroplethrAdmin1_1.1.1 choroplethrMaps_1.0.1  
[19] choroplethr_3.6.3       acs_2.1.4               XML_3.99-0.3           
[22] RColorBrewer_1.1-2      panelr_0.7.3            lme4_1.1-23            
[25] Matrix_1.2-18           dplyr_1.0.0             data.table_1.12.8      
[28] codebook_0.9.2          Statamarkdown_0.4.5     devtools_2.3.0         
[31] usethis_1.6.1           sqldf_0.4-11            RSQLite_2.2.0          
[34] gsubfn_0.7              proto_1.0.0             broom_0.7.0            
[37] zoo_1.8-8               altair_4.0.1            rbokeh_0.5.1           
[40] janitor_2.0.1           plotly_4.9.2.1          kableExtra_1.1.0       
[43] Hmisc_4.4-0             Formula_1.2-3           survival_3.1-12        
[46] lattice_0.20-41         ggplot2_3.3.1           stringr_1.4.0          
[49] stringi_1.4.6           tidyr_1.1.0             knitr_1.29             
[52] matrixStats_0.56.0      boot_1.3-25            

loaded via a namespace (and not attached):
  [1] vcd_1.4-7             class_7.3-17          ps_1.3.3             
  [4] lmtest_0.9-37         rprojroot_1.3-2       crayon_1.3.4         
  [7] laeken_0.5.1          MASS_7.3-51.6         nlme_3.1-148         
 [10] backports_1.1.8       rlang_0.4.6           readxl_1.3.1         
 [13] performance_0.4.6     nloptr_1.2.2.1        callr_3.4.3          
 [16] flextable_0.5.10      rjson_0.2.20          ggmap_3.0.0          
 [19] bit64_0.9-7           glue_1.4.1            sjPlot_2.8.4         
 [22] processx_3.4.2        classInt_0.4-3        tcltk_4.0.2          
 [25] haven_2.3.1           tidyselect_1.1.0      NADA_1.6-1.1         
 [28] rio_0.5.16            sf_0.9-3              sjmisc_2.8.5         
 [31] chron_2.3-55          xtable_1.8-4          magrittr_1.5         
 [34] evaluate_0.14         RgoogleMaps_1.4.5.3   cli_2.0.2            
 [37] rstudioapi_0.11       miniUI_0.1.1.1        sp_1.4-2             
 [40] robCompositions_2.2.1 rpart_4.1-15          jtools_2.0.5         
 [43] pls_2.7-2             zCompositions_1.3.4   sjlabelled_1.1.5     
 [46] RJSONIO_1.3-1.4       maps_3.3.0            shiny_1.5.0          
 [49] gistr_0.5.0           xfun_0.16             parameters_0.7.0     
 [52] pkgbuild_1.1.0        cluster_2.1.0         sgeostat_1.0-27      
 [55] tibble_3.0.1          permute_0.9-5         png_0.1-7            
 [58] reshape_0.8.8         withr_2.2.0           bitops_1.0-6         
 [61] ranger_0.12.1         plyr_1.8.6            cellranger_1.1.0     
 [64] pcaPP_1.9-73          e1071_1.7-3           coda_0.19-3          
 [67] pillar_1.4.6          mvoutlier_2.0.9       multcomp_1.4-13      
 [70] fs_1.4.1              flexmix_2.3-15        kernlab_0.9-29       
 [73] vctrs_0.3.1           ellipsis_0.3.1        generics_0.0.2       
 [76] nortest_1.0-4         rgdal_1.5-8           tools_4.0.2          
 [79] foreign_0.8-80        munsell_0.5.0         fastmap_1.0.1        
 [82] compiler_4.0.2        pkgload_1.1.0         abind_1.4-5          
 [85] httpuv_1.5.4          tigris_0.9.4          npmv_2.4.0           
 [88] sessioninfo_1.1.1     gridExtra_2.3         visNetwork_2.0.9     
 [91] later_1.1.0.1         jsonlite_1.6.1        WDI_2.6.0            
 [94] GGally_1.5.0          scales_1.1.1          carData_3.0-4        
 [97] estimability_1.3      lazyeval_0.2.2        promises_1.1.1       
[100] car_3.0-8             latticeExtra_0.6-29   reticulate_1.16      
[103] effectsize_0.3.1      checkmate_2.0.0       rmarkdown_2.3        
[106] openxlsx_4.1.5        sandwich_2.5-1        moments_0.14         
[109] statmod_1.4.34        webshot_0.5.2         forcats_0.5.0        
[112] pander_0.6.3          yaml_2.2.1            systemfonts_0.2.3    
[115] prabclus_2.3-2        htmltools_0.5.0       memoise_1.1.0        
[118] modeltools_0.2-23     viridisLite_0.3.0     digest_0.6.25        
[121] rrcov_1.5-2           assertthat_0.2.1      mime_0.9             
[124] rappdirs_0.3.1        repr_1.1.0            bayestestR_0.6.0     
[127] units_0.6-6           remotes_2.1.1         vegan_2.5-6          
[130] blob_1.2.1            splines_4.0.2         labeling_0.3         
[133] fpc_2.2-7             cvTools_0.3.2         hms_0.5.3            
[136] modelr_0.1.8          colorspace_1.4-1      base64enc_0.1-3      
[139] mnormt_2.0.0          tmvnsim_1.0-2         nnet_7.3-14          
[142] Rcpp_1.0.4.6          mclust_5.4.6          mvtnorm_1.1-1        
[145] fansi_0.4.1           VIM_6.0.0             truncnorm_1.0-8      
[148] R6_2.4.1              grid_4.0.2            lifecycle_0.2.0      
[151] acepack_1.4.1         labelled_2.5.0        zip_2.0.4            
[154] writexl_1.3           curl_4.3              pryr_0.1.4           
[157] minqa_1.2.4           testthat_2.3.2        snakecase_0.11.0     
[160] robustbase_0.93-6     skimr_2.1.1           desc_1.2.0           
[163] TH.data_1.0-10        htmlwidgets_1.5.1     officer_0.3.11       
[166] purrr_0.3.4           crosstalk_1.1.0.1     mgcv_1.8-31          
[169] rvest_0.3.5           insight_0.8.4         htmlTable_2.0.1      
[172] codetools_0.2-16      lubridate_1.7.9       prettyunits_1.1.1    
[175] vegawidget_0.3.1      gtable_0.3.0          DBI_1.1.0            
[178] stats4_4.0.2          httr_1.4.2            highr_0.8            
[181] KernSmooth_2.23-17    farver_2.0.3          uuid_0.1-4           
[184] diptest_0.75-7        hexbin_1.28.1         mice_3.9.0           
[187] xml2_1.3.2            ggeffects_0.14.3      readr_1.3.1          
[190] sROC_0.1-2            energy_1.7-7          DEoptimR_1.0-8       
[193] bit_1.1-15.2          sjstats_0.18.0        jpeg_0.1-8.1         
[196] pkgconfig_2.0.3       maptools_1.0-1        HardyWeinberg_1.6.6  
[199] Rsolnp_1.16          

References

1. Revelle W. Psych: Procedures for personality and psychological research. Book, 2019.

2. Polychoric T, Correlations P, Fox AJ. Package ‘polycor’. Statistics.

3. Ellis AR, Burchett WW, Harrar SW, et al. Nonparametric inference for multivariate data: The r package npmv. Journal of Statistical Software 2017; 76: 1–18.

4. Akbari H, Roshanpajouh M, Nourijelyani K, et al. Profile of drug users in the residential treatment centers of tehran, iran. Health promotion perspectives 2019; 9: 248–254.

5. Basu D, Ghosh A, Sarkar S, et al. Initial treatment dropout in patients with substance use disorders attending a tertiary care de-addiction centre in north india. The Indian journal of medical research 2017; 146: S77–S84.

6. Burchett W, Ellis A, Harrar S, et al. Nonparametric inference for multivariate data: The r package npmv. Journal of Statistical Software; 76. Epub ahead of print 2017. DOI: 10.18637/jss.v076.i04.

7. Calcagno V, Mazancourt C de. Glmulti: An r package for easy automated model selection with (generalized) linear models. Epub ahead of print 2010. DOI: 10.18637/jss.v034.i12.

8. Castillo-Carniglia Á, Marín JD, Soto-Brandt G, et al. Adaptation and validation of the instrument treatment outcomes profile to the chilean population. Journal of Substance Abuse Treatment. Epub ahead of print 2015. DOI: 10.1016/j.jsat.2015.03.002.

9. Center for Substance Abuse T. SAMHSA/csat treatment improvement protocols. In: Substance abuse treatment for persons with co-occurring disorders. Book Section, Rockville (MD): Substance Abuse; Mental Health Services Administration (US), 2005.

10. Collaborators GBD2D, Incidence I, Prevalence. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990-2017: A systematic analysis for the global burden of disease study 2017. Lancet 2018; 392: 1789–1858.

11. Control de Estupefacientes [CONACE] CN para el. Orientaciones técnicas. Tratamiento del consumo problemático de alcohol y drogas y otros trastornos de salud mental en adolescentes infractores de ley. Report, 2007.

12. Control de Estupefacientes[CONACE] CN para el. Modelo de intervención en personas con consumo problemático de sustancias psicoactivas, recluidas en los establecimientos penitenciarios chilenos. Tomo i. Elementos teóricos del programa de tratamiento, rehabilitación y reinserción social, para internos/as con consumo problemático de sustancias psicoactivas. Report, Ministerio del Interior, http://sistemas.senda.gob.cl/sistema-monitoreo/biblioteca/files/Documentos/ESTRATEGIAS%20NORMAS%20ORIENTACIONES/1%20Orientaciones%20y%20Normas/Nacional/Senda/Modelo%20de%20intervenci%C3%B3n%20de%20personas%20recluidas%20en%20establecimientos%20penitenciarios%20VOL%201.pdf (2005).

13. Control de Estupefacientes[CONACE] CN para el. Modelo de intervención en personas con consumo problemático de sustancias psicoactivas, recluidas en los establecimientos penitenciarios chilenos. Tomo ii. Programa de tratamiento, rehabilitación y reinserción social, para internos/as con consumo problemático de sustancias psicoactivas. Report, Ministerio del Interior, http://sistemas.senda.gob.cl/sistema-monitoreo/biblioteca/files/Documentos/ESTRATEGIAS%20NORMAS%20ORIENTACIONES/1%20Orientaciones%20y%20Normas/Nacional/Senda/Modelo%20de%20intervenci%C3%B3n%20de%20personas%20recluidas%20en%20establecimientos%20penitenciarios%20VOL%202.pdf (2005).

14. Copoeru I. Understanding addiction: A threefold phenomenological approach. Human Studies 2014; 37: 335–349.

15. De Boni RB, Peratikos MB, Shepherd BE, et al. Is substance use associated with hiv cascade outcomes in latin america? PLoS ONE. Epub ahead of print 2018. DOI: 10.1371/journal.pone.0194228.

16. Dixon P. VEGAN, a package of r functions for community ecology. Journal of Vegetation Science 2003; 14: 927–930.

17. Fox J. Polycor: Polychoric and polyserial correlations. Book, 2019.

18. Fukushima AR, Corrêa LT, Muñoz JWP, et al. Crack cocaine, a systematic literature review. Forensic Research & Criminology International Journal 2019; 7: 247–253.

19. Gmel G, Akre C, Astudillo M, et al. The swiss cohort study on substance use risk factors - findings of two waves. Sucht 2015; 61: 251–262.

20. Group CG on DM, Working DU2IE. Drug misuse and dependence: UK guidelines on clinical management. Report, Public Health England. Alcohol, Drugs & Tobacco Division, https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/673978/clinical_guidelines_2017.pdf (2017).

21. Hansen Z. A phenomenological investigation of clinical intuition among alcohol and drug counselors. Conference Proceedings.

22. Heinze G, Armas-Castañeda G. Public policies on the use of drugs in mexico and latin america. Drug Science, Policy and Law 2015; 2: 2050324515611587.

23. Hildebrandt T, Epstein EE, Sysko R, et al. Using factor mixture models to evaluate the type a/b classification of alcohol use disorders in a heterogeneous treatment sample. Alcoholism, clinical and experimental research 2017; 41: 987–997.

24. Hjemsæter AJ, Bramness JG, Drake R, et al. Mortality, cause of death and risk factors in patients with alcohol use disorder alone or poly-substance use disorders: A 19-year prospective cohort study. BMC Psychiatry 2019; 19: 101.

25. Hsieh M-H, Tsai S-L, Tsai C-H, et al. What is the addiction world like? Understanding the lived experience of the individuals’ illicit drug addiction in taiwan. Perspectives in Psychiatric Care 2017; 53: 47–54.

26. Huidobro R, Lozier M, Oliva M. Informe resultados: Treatment outcomes profile (top) chile. Book, 2016.

27. [CICAD] I-ADACC. Report on drug use in the americas 2019. Report 978-0-8270-6793-6, Organization of American States (OAS), http://www.cicad.oas.org/main/pubs/Report%20on%20Drug%20Use%20in%20the%20Americas%202019.pdf (2019).

28. Johannessen DA, Nordfjærn T, Geirdal AØ. Change in psychosocial factors connected to coping after inpatient treatment for substance use disorder: A systematic review. Substance abuse treatment, prevention, and policy 2019; 14: 16–16.

29. Klein M. Relapse into opiate and crack cocaine misuse: A scoping review. Addiction Research & Theory 2020; 1–19.

30. Kopak AM, Combs E, Goodman K, et al. Exposure to violence and substance use treatment outcomes among female patients. Substance Use & Misuse 2019; 54: 362–372.

31. Korkmaz D S. Goksuluk, Zararsiz G. MVN: An r package for assessing multivariate normality. The R Journal 2014; 6: 151–162.

32. Lappan SN, Brown AW, Hendricks PS. Dropout rates of in-person psychosocial substance use disorder treatments: A systematic review and meta-analysis. Addiction 2020; 115: 201–217.

33. Leung J, Peacock A, Colledge S, et al. A global meta-analysis of the prevalence of hiv, hepatitis c virus, and hepatitis b virus among people who inject drugs—do gender-based differences vary by country-level indicators? The Journal of Infectious Diseases 2019; 220: 78–90.

34. Lopes-Rosa R, Kessler FP, Pianca TG, et al. Predictors of early relapse among adolescent crack users. Journal of Addictive Diseases 2017; 36: 136–143.

35. Lozano ÓM, Rojas AJ, Fernández Calderón F. Psychiatric comorbidity and severity of dependence on substance users: How it impacts on their health-related quality of life? Journal of Mental Health 2017; 26: 119–126.

36. Marín-Navarrete M R. Medina-Mora, Pérez-López A, Horigian V. Development and evaluation of addiction treatment programs in latin america. Current opinion in psychiatry 2018; 31: 306–314.

37. Maunder RG, Wiesenfeld L, Lawson A, et al. The relationship between childhood adversity and other aspects of clinical complexity in psychiatric outpatients. Journal of Interpersonal Violence 2019; 0886260519865968.

38. Medlock MM, Rosmarin DH, Connery HS, et al. Religious coping in patients with severe substance use disorders receiving acute inpatient detoxification. The American journal on addictions 2017; 26: 744–750.

39. Messas G, Fukuda L, Pienkos E. A phenomenological contribution to substance misuse treatment: Principles for person-centered care. Psychopathology 2019; 52: 1–9.

42. Moradinazar M, Najafi F, Jalilian F, et al. Prevalence of drug use, alcohol consumption, cigarette smoking and measure of socioeconomic-related inequalities of drug use among iranian people: Findings from a national survey. Substance abuse treatment, prevention, and policy 2020; 15: 39–39.

43. Drug Abuse[NIDA] NI on. Part 1: The connection between substance use disorders and mental illness. Report, National Institute on Drug Abuse, https://www.drugabuse.gov/publications/research-reports/common-comorbidities-substance-use-disorders/part-1-connection-between-substance-use-disorders-mental-illness.

44. Organization[WHO] WH. World drug report 2019 (sales no. E.19.XI.8). Conference Proceedings.

45. Pachado MP, Scherer JN, Guimarães LSP, et al. Markers for severity of problems in interpersonal relationships of crack cocaine users from a brazilian multicenter study. Psychiatric Quarterly 2018; 89: 923–936.

46. Pacurucu-Castillo SF, Ordóñez-Mancheno JM, Hernández-Cruz A, et al. World opioid and substance use epidemic: A latin american perspective. Psychiatric Research and Clinical Practice. Epub ahead of print 2019. DOI: 10.1176/appi.prcp.20180009.

47. Papadimitriou G. The "biopsychosocial model": 40 years of application in psychiatry. Psychiatriki 2017; 28: 107–110.

48. Pérez-Gómez A, Mejía-Trujillo J. The evolution of alcohol and drug prevention strategies in latin america. In: Romano JL, Israelashvili M (eds) The cambridge handbook of international prevention science. Book Section, Cambridge: Cambridge University Press, pp. 753–779.

49. Portela A, Garcia L, Goldim J. Adolescencia vulnerable: Factores biopsicosociales relacionados al uso de drogas. Revista Bioética 2015; 23: 311–319.

50. Reyes J, Pérez C, Colon H, et al. Prevalence and patterns of polydrug use in latin america: Analysis of population-based surveys in six countries. Review of European Studies; 5. Epub ahead of print 2013. DOI: 10.5539/res.v5n1p10.

51. Ronzani TM, Fuentes-Mejía C, Mota DCB, et al. Intervenciones breves para el abuso de sustancias en américa latina: Una revisión sistemática. Psicologia em Estudo; 24, http://www.scielo.br/scielo.php?script=sci_arttext&pid=S1413-73722019000100232&nrm=iso (2019).

52. Saberi HA MB. Khanipour. Typology of substance use disorder based on temperament dimensions, addiction severity, and negative emotions. Iran J Psychiatry 2018; 13: 184–190.

53. Saberi Zafarghandi MB, Khanipour H, Ahmadi SM. Typology of substance use disorder based on temperament dimensions, addiction severity, and negative emotions. Iranian journal of psychiatry 2018; 13: 184–190.

54. Schmid M, Michaud L, Bovio N, et al. Prevalence of somatic and psychiatric morbidity across occupations in switzerland and its correlation with suicide mortality: Results from the swiss national cohort (1990–2014). BMC Psychiatry 2020; 20: 324.

55. Silva J da, Ventura CAA, Vargens OM da C, et al. Illicit drug use in seven latin american countries: Critical perspectives of families and familiars. Revista Latino-Americana de Enfermagem 2009; 17: 763–769.

56. Simoneau H, Brochu S. Addiction severity index profile of persons who reenter treatment for substance use disorders. Substance Abuse 2017; 38: 432–437.

57. Soto-Brandt G, Portilla Huidobro R, Huepe Artigas D, et al. [Validity evidence of the alcohol, smoking and substance involvement screening test (assist) in chile]. Adicciones 2014; 26: 291–302.

58. Stone J, Marsh A, Dale A, et al. Counselling guideleines: Fourth edition: Alchol and other drug issues. Book, Western Australian Government - Mental Health Commission, https://books.google.cl/books?id=K_k2xQEACAAJ (2019).

60.. 2002; 33–47.

61. Tiffany ST, Friedman L, Greenfield SF, et al. Beyond drug use: A systematic consideration of other outcomes in evaluations of treatments for substance use disorders. Addiction (Abingdon, England) 2012; 107: 709–718.

62. Valdivieso M. Sistematización de las prácticas de un programa de rehabilitación en drogas y alcohol para mujeres, con enfoque de género, que ha venido realizando el equipo de profesionales y técnicos del ctr del hospital de peñablanca. Thesis, http://repositorio.uchile.cl/handle/2250/135697 (2014).

63. Vos T, Allen C, Arora M, et al. Global, regional, and national incidence, prevalence, and years lived with disability for 310 diseases and injuries, 1990–2015: A systematic analysis for the global burden of disease study 2015. The Lancet. Epub ahead of print 2016. DOI: 10.1016/S0140-6736(16)31678-6.

64. Zijlmans EAO, Ark LA van der, Tijmstra J, et al. Methods for estimating item-score reliability. Applied Psychological Measurement. Epub ahead of print 2018. DOI: 10.1177/0146621618758290.